WORLD INTELLECTUAL PROPERTY ORGANIZATION
International Bureau

PCT

INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)

(51) International Patent Classification:
C12N 15/67, 15/81

A2

(11) International Publication Number: WO 94/19472

(43) International Publication Date: 1 September 1994 (01.09.94)

(21) International Application Number: PCT/GB94/00373

(22) International Filing Date: 25 February 1994 (25.02.94)

(30) Priority Data:
9303988.1    26 February 1993 (26.02.93)    GB

(71) Applicant (for all designated States except US): THE PUBLIC
HEALTH LABORATORY SERVICE BOARD [GB/GB];
61 Colindale Avenue, London NW9 5DF (GB).

(72) Inventors; and

(75) Inventors/Applicants (for US only): MINTON, Nigel, Peter
[GB/GB]; 27 Moberly Road, Salisbury, Wiltshire SP1
3BZ (GB). FAULKNER, James, Duncan, Bruce [GB/GB];
14 Bishops Court, John Game Way, Marston, Oxford,
Oxfordshire OX3 0TX (GB).

(74) Agent: LOCKWOOD, Peter, Brian; MOD(PE), DIPR, IPRIb,
Room 2002 Empress State Building, Lillie Road, London
SW6 1TR (GB).

(81) Designated States: AU, CA, GB, JP, US, European patent (AT,
BE, CH, DE, DK, ES, FR, GB, GR, IE, IT, LU, MC, NL,
PT, SE).

Published

Without international search report and to be republished
upon receipt of that report.

(54) Title: BI-FUNCTIONAL EXPRESSION SYSTEM
(57) Abstract 

An over expression system is provided comprising a DNA sequence containing transcriptional and translational signals that promote
the over production of recombinant proteins both in bacterial hosts (e.g., Escherichia coli) and yeasts (e.g., Saccharomyces cerevisiae).
The design of the expression system lends itself to a unique cloning strategy which allows heterologous genes to be directly cloned at a position
relative to the transcription/translation signals which is optimal for expression. Particularly provided are expression cassettes comprising a
sequence of the invention combined with a purpose built series of plasmids wherein the utility and efficiency of the resultant expression
vectors can be demonstrated to over produce protein, particularly phenylalanine ammonia lyase (herein abbreviated to PAL) in E. coli and
S. cerevisiae to levels hitherto unattainable.







BI-FUNCTIONAL EXPRESSION SYSTEM



The present invention relates to novel promoter DNA, particularly a
novel expression system comprising DNA having a sequence containing
transcriptional and translational signals that promote the over
production of recombinant proteins both in bacterial hosts (eg.,
Escherichia coli) and yeasts (eg., Saccharomyces cerevisiae); and to
a novel cloning method that allows the insertion of a heterologous
gene into a vector or expression cassette directly at the authentic
translational start point of a promoter, with no deleterious changes
being made to either the native 5'-UTR of a vector promoter or to the
codons of the inserted gene; allowing production of that promoter
DNA. The design of the expression system lends itself to this unique
strategy, which allows heterologous genes to be directly cloned at an
optimal position relative to the transcription/translation signals.

Particularly provided are expression cassettes comprising a sequence
of the invention combined with a purpose built series of plasmids
wherein the utility and efficiency of the resultant expression vectors
can be demonstrated to over produce protein, particularly that of
phenylalanine ammonia lyase (herein abbreviated to PAL), in E. coli
and S. cerevisiae to levels hitherto unattainable.

Although considerable progress has been made towards the development 
of expression systems for yeast (reviewed in Rose and Broach, 1990), 
the vectors lack the sophistication and versatility of their bacterial 
counterparts. Current vectors often contain many superfluous DNA
sequences, which make them cumbersome and difficult to amplify and 
isolate in large quantities. The wealth of DNA present means that 
unique restriction sites are limited in number. 



Yeast expression vectors are usually of the "sandwich" variety, 
whereby cloning sites are "sandwiched" between a homologous yeast 
promoter and transcriptional termination signals. The precise 
positioning of the cloning sites with respect to the authentic 
initiating codon (AUG) of the homologous yeast promoter represents 
something of a dilemma. If one chooses to place the cloning sites 
upstream to the AUG, then one inevitably disrupts the native 
5'-untranslated region (5'-UTR) of the yeast promoter. Unavoidable
insertion of heterologous untranslated sequence elements containing a
high proportion of G residues, or elements creating secondary
structures or containing the inserted AUG in a sub-optimal nucleotide
context, can have catastrophic effects on expression levels,
regardless of the strength of transcriptional activation signals
(Donahue and Cigan, 1988; Baim and Sherman, 1988). For example,
Bitter and Egan (1984) reported 10 - 15 fold lower expression levels
of a Hepatitis B surface antigen (HBsAg) gene, fused to a yeast
glyceraldehyde-3-phosphate dehydrogenase (GPD) gene promoter, but
utilising the native HBsAg 5' flanking region, compared to HBsAg fused
to a GPD promoter and utilising the GPD 5' flanking region.

The alternative is to position the cloning sites immediately 3' to the
authentic AUG of the yeast promoter. However, this has its own 
concomitant problems. Care must be taken that the fusion is "in 
frame", while the non-authentic amino terminus of the expressed 
protein may have unpredictable effects on its biological activity and 
antigenicity. These last two points render such fusion proteins 
unsuitable for use as a pharmaceutical without modification. 

Preferably cloning is directly from the authentic AUG initiation 
codon. However, there has been no reported instance of a native yeast 
promoter with a usable restriction site encompassing its translational 
start point and the artificial creation of one would inevitably 
disrupt the start codon or its nucleotide context. The alternative is 
the lengthy and expensive procedure of chemically synthesizing an 
oligonucleotide "bridge" fragment that reaches from a convenient 
restriction site in the promoter 5' to the translational start to a 
site 3' to the ATG in the coding region to be expressed. Such a
procedure is not applicable to a routine, versatile cloning strategy. 

A further disadvantage of currently available yeast expression
vectors is that, as they employ homologous yeast promoters containing
powerful transcriptional activating sequences, they do not direct
efficient transcription or translation in bacterial hosts, such as
E. coli (Ratzkin and Carbon, 1977; Struhl, 1986). Similarly,
bacterially derived transcriptional/translational signals are
inefficiently utilised in S. cerevisiae, if at all. Comparative
expression studies of heterologous genes in E. coli and S. cerevisiae
therefore require the use of two separate vector systems.

A preferred aspect of the present invention describes how both the
specificity and efficiency of a yeast promoter element, particularly
that of S. cerevisiae, particularly that of PGK, can be changed to
direct the high expression of heterologous genes in bacteria and yeasts.

A first aspect of the invention provides promoter DNA incorporating a
structural gene starting position, characterised in that the DNA has a
unique SspI restriction site at the structural gene start position. A
second aspect of the invention provides a novel cloning strategy, ie.
method, that allows the insertion of a heterologous gene into the
expression cassette directly at the authentic translational start
point of the promoter, with no deleterious changes being made to
either the native 5'-UTR of the vector promoter or to the codons of
the inserted gene; thus providing the promoter DNA of the first aspect.

A third aspect of the invention provides recombinant DNA comprising a
yeast promoter sequence, particularly of S. cerevisiae, characterized
in that the leader region of the promoter sequence is replaced with
that of the replication protein 2 (REP2) gene (ORF C) of the yeast 2
µm plasmid (Hartley and Donelson, 1980). A preferred yeast promoter
derived portion is that of the phosphoglycerate kinase (PGK) promoter
and encompasses powerful upstream activating sequences (UAS) (Ogden et
al., 1986), responsible for efficient transcription in S. cerevisiae.
The sequences necessary for efficient transcription in E. coli reside
in the REP2 derived portion of the hybrid promoter. Sequences
necessary for efficient translation, both in S. cerevisiae and E. coli,
also reside in the REP2-derived portion of the promoter.

A fourth aspect of the invention provides the promoter hybrid of the 
invention incorporated into an expression "cassette", in which a copy 
of the lacZ' gene, containing the multiple cloning sites of pMTL23, is 
preceded by the promoter, and followed by tandemly arranged, yeast 
gene-derived, transcriptional terminators. 

In the cloning method of the second aspect (illustrated in Figure 1)
promoter DNA incorporating a structural gene starting position, eg.
within an expression cassette, is modified using SDM by creating a
unique SspI restriction site at a structural gene start position. The
position of this created site is such that the triplet sequence, ATG,
corresponding to the translational start codon of the structural gene
becomes ATA within the SspI recognition site AATATT. The heterologous
gene to be inserted is similarly modified. In this case the
nucleotide triplet corresponding to the translational start codon
(eg., AUG, GUG, or UUG) is changed to CAG, while the triplet
immediately 5' is changed to CTG. These changes correspond to the
creation of a PstI restriction site, CTGCAG. The creation of the
PstI, or equivalent, site can be conveniently performed simultaneously
with isolation of the gene by utilising a mutagenic primer in a
polymerase chain reaction (PCR) catalysed gene amplification procedure
(Higuchi et al., 1988). The modified heterologous gene can then be
digested with PstI restriction endonuclease and the 3' overhanging
ends removed, eg. by the 3' to 5' exonucleolytic activity of T4 DNA
polymerase. The heterologous gene can then be excised using any of
the restriction enzymes whose sites are present within the polylinker
of the vector.

The net result of the actions of these DNA modifying enzymes is that
the first base of the blunt-ended DNA fragment is the third
nucleotide, "G", of its first codon. It is then ligated into the
vector which has been digested previously with SspI and a restriction
enzyme compatible with that used to excise the heterologous gene.
Fusion of the vector promoter region (which ends in "AT") and
heterologous gene (which begins in a "G") results in the recreation of
the translational start, ATG.
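
The fusion described above can be sketched as a toy, top-strand-only model. The sequences are invented placeholders rather than data from this specification, the function names are hypothetical, and the second restriction digest at the 3' end of the gene is omitted for brevity.

```python
# Toy top-strand model of the SspI / PstI blunt-end fusion.
# The sequences below are invented placeholders, not data from the patent.

SSPI_SITE = "AATATT"  # SspI cuts AAT^ATT and leaves blunt ends
PSTI_SITE = "CTGCAG"  # PstI cuts CTGCA^G and leaves a 3' overhang


def vector_half_after_sspi(promoter: str) -> str:
    """Return the promoter-side fragment of an SspI digest (ends in ...AAT)."""
    cut = promoter.index(SSPI_SITE) + 3
    return promoter[:cut]


def gene_after_psti_and_t4(gene: str) -> str:
    """Return the gene fragment after PstI digestion and 3' overhang removal.

    Once the overhang is trimmed back, the top strand of the blunt
    fragment starts at the final G of CTGCAG.
    """
    cut = gene.index(PSTI_SITE) + 5
    return gene[cut:]


def fuse(promoter: str, gene: str) -> str:
    """Blunt-end ligate the two fragments (top strand only)."""
    return vector_half_after_sspi(promoter) + gene_after_psti_and_t4(gene)


promoter = "TACAAGAAAATATTGCCC"    # hypothetical promoter with one SspI site
gene = "GGACTGCAGAAACCCTAAGAATTC"  # hypothetical gene with engineered PstI site

fused = fuse(promoter, gene)
junction = len(vector_half_after_sspi(promoter))
print(fused[junction - 2 : junction + 1])  # triplet spanning the ligation point
```

In this sketch the "AT" contributed by the vector and the "G" contributed by the gene meet at the ligation point, so the triplet printed is the recreated start codon and the downstream codons of the toy gene are unchanged.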

The fourth aspect of the invention provides an expression system
obtainable using the method of the invention such that overexpression
of proteins is possible. A particular example of this is provided in
the over expression of the phenylalanine ammonia-lyase (PAL) gene from
Rhodosporidium toruloides in both E. coli and S. cerevisiae. This
is made possible by incorporating an expression cassette provided by
the method into a purpose built, unique series of S. cerevisiae /
E. coli shuttle plasmids. Preferably every component of these shuttle
plasmids is extensively modified to reduce the presence of superfluous
DNA in the final vectors and to eliminate nucleotide sequence motifs
corresponding to the restriction enzyme recognition sites of use in
the operation of the expression cassette. The levels of recombinant
PAL attained in S. cerevisiae are significantly higher than those
obtained using the PGK promoter alone. Whereas the PGK promoter alone
fails to elicit the expression of PAL in E. coli, the levels of
recombinant PAL obtained using the hybrid promoters are far in excess
of those previously obtained using expression vectors designed for
high expression in E. coli.

The DNA, cassettes, and organisms of the present invention will now be 
illustrated by reference to the following non-limiting Figures and
Examples. Other variations falling within the scope of the invention 
will be apparent to those skilled in the art in the light of these. 

FIGURES and SEQUENCE LISTING SEQUENCES

Figure 1: shows the design of the novel cloning method which allows 
cloning to take place directly at the authentic translational start 
point of a promoter.

Figure 2: shows a comparison of sequences found 5' to the 
translational start codon in REP2 and PGK, compared to a consensus 
yeast sequence, the sequence found 5' to the neo gene and a consensus 
procaryotic promoter sequence. 

Figure 3: shows how genes are inserted into the expression cassette
using the SspI site.

Figure 4: shows an overview of the pMTL 8XXX vectors of the invention.

Figure 5: shows an SDS-PAGE electrophoretogram of lysates derived 
from microbial cells producing recombinant PAL. 

Figure 6: shows the construction of pMTL 8000 and pMTL 8100 by
inserting a 1.4 kb RsaI fragment from pVT100-U, containing the origin
of replication and the STB locus from the 2 µm circle plasmid, into
the EcoRV site of pMTLJ and pMTL CJ.

Figure 7: shows the construction of pMTLCJ by replacing the bla gene
of pMTL4 with the cat gene of pCM4, modified by SDM-mediated removal
of EcoRI, NcoI and SspI restriction sites.

SEQ ID No 1: shows the complete nucleotide sequence of a novel
expression cassette (SEQ ID No 1) of the invention including a
sequence comprising the PGK::REP2 promoter (bases 1-635; REP2
fragment consists of bases 547-635 of SEQ ID No 1).

SEQ ID No 2: is the complete nucleotide sequence of a 
comparative control cassette, containing the PGK promoter. 

SEQ ID No 3: is the nucleotide sequence of plasmid pMTL 8000.

SEQ ID No 4: is the nucleotide sequence of plasmid pMTL 8100.

SEQ ID NO 5: is the nucleotide sequence of the URA3-J and ura3-dJ 
(189 =G) alleles used in the vector construction. 

SEQ ID No 6: is the nucleotide sequence of the leu2-dJ allele used in 
vector construction. 



In SEQ ID No 1, the original nucleotide source DNA sequences have been
changed by SDM at the following points:

Base 548-549    was TT    now AC    Creates ClaI::AccI
Base 557        was G     now T
Base 580        was G     now T
Base 636-638    was CTG   now ATT
Base 1033-1035  was TAA   now GTT   Creates HpaI::BglII
Base 1149       was G     now C     Removes ClaI
Base 1223       was T     now A     Removes SspI
Base 1484       was G     now T     Removes AccI

Restriction endonuclease sites are provided in regions of DNA as
follows:

Base 630-640 SspI
Base 760-870 NruI, StuI, XhoI, BglII, ClaI, SphI, NcoI, KpnI, SmaI,
SstI, EcoRI, XbaI, HindIII, PstI, MluI, AccI, SalI, AatII, NdeI,
BamHI, EcoRV, NaeI.
Base 1610-1619 SphI

Fusible ends were produced from the source DNA by ClaI and AccI for
the fusion between base 546 and 547; by HpaI and BglII for the
fusion between base 1035 and 1036; and by HindIII and HincII for the
fusion between base 1412 and 1413.
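
The "was/now" entries listed for SEQ ID No 1 can be read as 1-based substitutions on the source sequence. A minimal sketch of applying such a list is given below; the function name and the toy sequence are hypothetical, and the check on the "was" bases is an added safeguard rather than anything stated in the specification.

```python
# Apply "Base N was X now Y" style changes to a sequence (1-based positions).
# The toy sequence is an invented placeholder, not part of any SEQ ID listing.

def apply_sdm(seq: str, changes: list[tuple[int, str, str]]) -> str:
    """Return seq with each (start, was, now) substitution applied.

    Raises ValueError if the bases found at a position do not match the
    recorded "was" bases, or if a change would alter the sequence length.
    """
    bases = list(seq)
    for start, was, now in changes:
        if len(was) != len(now):
            raise ValueError(f"change at base {start} alters the length")
        i = start - 1
        found = "".join(bases[i : i + len(was)])
        if found != was:
            raise ValueError(f"base {start}: expected {was}, found {found}")
        bases[i : i + len(now)] = list(now)
    return "".join(bases)


toy = "GGATCGATTT"                       # contains a ClaI site, ATCGAT
mutated = apply_sdm(toy, [(5, "C", "T")])
print(mutated, "ATCGAT" in mutated)      # site is destroyed by one substitution
```

Checking the "was" bases before substituting guards against off-by-one errors when positions are copied from a listing by hand.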

In SEQ ID No 2, the original nucleotide source DNA sequences have been 
changed by SDM at the following points: 



Base 3          was C     now A     Creates EcoRI
Base 725-727    was GAT   now AAG   Removes ClaI
Base 768-770    was GTC   now ATT   Creates SspI
Base 1165-1167  was TAA   now GTT   Creates HpaI
Base 1281       was G     now C     Removes ClaI
Base 1356       was T     now A     Removes SspI
Base 1616       was G     now T     Removes AccI

Restriction endonuclease sites are provided in regions of DNA as
follows:

Base 1-10 EcoRI
Base 180-190 XmnI
Base 760-770 SspI
Base 890-1000 NruI, StuI, XhoI, BglII, ClaI, SphI, NcoI, KpnI, SmaI,
SstI, EcoRI, XbaI, HindIII, PstI, MluI, AccI, SalI, AatII, NdeI,
BamHI, EcoRV, NaeI.
Base 1740-1751 SphI

Fusible ends were produced from the source DNA using HpaI and BglII
for the fusion between 1167 and 1168; and using HindIII::HincII for
the fusion between 1543 and 1544.



In SEQ ID No 3, derived from pVT100-U (bases 1-290 and 2295-3400) and
pMTLJ (bases 291-2294), the original nucleotide source DNA sequences
have been changed at the following points:

Base 425 was T now C Removes SspI

Restriction endonuclease sites are provided in regions of DNA as
follows:

Base 1-10 SspI
Base 1360-1370 DraI
Base 2520-2530 HpaI

Fusible ends were produced from the source DNA using RsaI::EcoRV for
the fusion between bases 290 and 291 and using EcoRV::RsaI for the
fusion between bases 2294 and 2295.

In SEQ ID No 4, derived from pVT100-U (bases 1-290 and from base 2144),
pMTL CJ (bases 291-426 and 1214-2143) and pCM4 (bases 427-1213), the
original nucleotide source sequences have been changed at the
following points:

Base 676 was A now G Removes EcoRI
Base 976 was C now A Removes NcoI
Base 985 was A now G Removes SspI

Restriction endonuclease sites are provided in regions of DNA as
follows:

Base 1-10 SspI
Base 2370-2380 HpaI

Fusible ends were produced from the source DNA using RsaI::EcoRV for
the fusion between bases 290 and 291, SspI::BamHI for the fusion
between bases 426 and 427, BamHI::DraI for the fusion between bases
1213 and 1214 and EcoRV::RsaI for the fusion between bases 2143 and
2144.

In SEQ ID No 5 the original nucleotide source sequence has been
changed at the following points:

Base 150   was C   now G   Removes NdeI
Base 289   was G   now C   Removes HpaI
Base 440   was C   now G   Removes NcoI
Base 563   was C   now T   Removes StuI
Base 1063  was G   now C   Removes AccI

Restriction endonuclease sites are thus absent from this sequence.

In SEQ ID No 6 the original nucleotide source sequence has been
changed at the following points:

Base 294 was T now C Removes ClaI
Base 780 was C now T Removes EcoRI

Restriction endonuclease sites are provided in regions of DNA as
follows:

Base 380-390 KpnI

Example 1: Expression Cassettes. An example nucleotide composition
of the expression cassette containing the essential elements of this
invention is designated SEQ ID No 1, and was formed by fusing DNA
regions from PGK (base 1-546 and base 1036-1411), REP2 (base 547-635),
lacZ' (base 636-1035) and ADH1 (base 1412-1619); base numbers are
those in SEQ ID No 1, not source DNA. Prior to fusion, the sequence
composition of each element was altered to varying extents using
site-directed mutagenesis (SDM). In the majority of cases the changes
were made either to eliminate a restriction enzyme recognition site
common to the polylinker region within lacZ', or to create a
restriction recognition site to facilitate the construction of the
cassette. To assess the advantages of the novel promoter element, a
second cassette was constructed, which contained no REP2 derived
nucleotides, to act as a control. The sequence composition of this
control cassette is shown as SEQ ID No 2.
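
As a quick consistency check, the base ranges given for the elements of SEQ ID No 1 can be tested for gaps or overlaps. The sketch below assumes the ranges are read as PGK 1-546, REP2 547-635, lacZ' 636-1035, PGK 1036-1411 and ADH1 1412-1619; the function name is hypothetical.

```python
# Check that the element ranges quoted for SEQ ID No 1 tile the cassette
# without gaps or overlaps (1-based, inclusive coordinates).

SEGMENTS = [
    ("PGK", 1, 546),
    ("REP2", 547, 635),
    ("lacZ'", 636, 1035),
    ("PGK", 1036, 1411),
    ("ADH1", 1412, 1619),
]


def segment_lengths(segments):
    """Return [(name, length)] after verifying that segments are contiguous."""
    for (_, _, prev_end), (name, start, _) in zip(segments, segments[1:]):
        if start != prev_end + 1:
            raise ValueError(f"{name} starts at {start}, expected {prev_end + 1}")
    return [(name, end - start + 1) for name, start, end in segments]


lengths = segment_lengths(SEGMENTS)
total = sum(length for _, length in lengths)
for name, length in lengths:
    print(f"{name:6s} {length:5d} bp")
print(f"total  {total:5d} bp")
```

Under these assumed ranges the elements are contiguous and their lengths sum to the last base number quoted for the cassette.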

The expression cassettes consist of the E. coli lacZ' gene,
containing the pMTL23 polylinker cloning sites (Chambers et al.,
1988), sandwiched by nucleotide signals for transcriptional initiation
and termination. The transcriptional initiation signals of the hybrid
promoter are provided by a unique combination of sequences derived
from the promoters of PGK and REP2. The upstream activating sequence
(UAS) element and TATA-box are from the PGK promoter and are fused to
the 86 nucleotides residing immediately 5' to the 2 µm plasmid REP2
gene. The REP2 promoter is constitutive in nature (Som et al., 1988),
and not generally regarded as a "strong" yeast promoter.

Within the hybrid promoter, the REP2 region is also responsible for
providing the expression cassette with promoter activity in E. coli.
The region used contains sequence motifs which exactly correspond to
those sequences necessary for transcription in procaryotes such as
E. coli. Thus two hexanucleotide sequences are present, TTGACA and
TATAAT, which exactly correspond to the consensus -35 and -10 boxes of
E. coli promoters (Harley and Reynolds, 1987), and the spacing
between them, 18 bp, is also consistent with a functional E. coli
promoter. In addition, the AUG start codon of REP2 is preceded by the
nucleotide motif -AGAA-.
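
A search for exact matches to the two consensus hexanucleotides at a permitted spacing can be sketched as follows. The function name, the accepted spacer lengths (16 to 18 bp, an assumption based on the commonly cited range) and the toy sequence are illustrative only; the REP2 sequence itself is not reproduced here.

```python
# Scan a sequence for exact -35 (TTGACA) and -10 (TATAAT) boxes separated
# by an allowed spacer length. Positions returned are 0-based.

BOX_35 = "TTGACA"
BOX_10 = "TATAAT"


def find_promoter_boxes(seq: str, spacers=(16, 17, 18)):
    """Return a list of (pos_35, pos_10, spacer) for every exact match."""
    seq = seq.upper()
    hits = []
    start = seq.find(BOX_35)
    while start != -1:
        for gap in spacers:
            pos_10 = start + len(BOX_35) + gap
            if seq[pos_10 : pos_10 + len(BOX_10)] == BOX_10:
                hits.append((start, pos_10, gap))
        start = seq.find(BOX_35, start + 1)
    return hits


# Invented test sequence with an 18 bp spacer between the two boxes.
toy = "GG" + BOX_35 + "C" * 18 + BOX_10 + "GG"
print(find_promoter_boxes(toy))
```

Real promoters rarely match both consensus boxes exactly, so a practical scan would normally score mismatches rather than require identity; exact matching is used here only because the text states that the REP2 motifs correspond exactly.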

The transcription initiation and termination signals flank unique
restriction enzyme recognition sites into which heterologous genes may
be inserted; with the exception of SspI, these sites form part of the
lacZ' structural gene. Their location within the lacZ' gene allows
the rapid detection of recombinant clones derived from the plasmid.
The lacZ' gene encodes the alpha-peptide of β-galactosidase, such that
when it is produced in E. coli hosts carrying the lacZ delta M15
mutation, β-galactosidase activity is restored, leading to return of
the ability to metabolise the chromogenic substrate X-Gal and the
production of blue colonies on agar medium supplemented with X-Gal.
The insertion of heterologous DNA into the cloning sites of the
expression cassette results in the inactivation of lacZ', and cells
transformed with recombinant plasmid therefore produce colourless
colonies on agar medium supplemented with X-Gal (Vieira and Messing,
1982).

The cassette is designed such that heterologous genes to be expressed
are cloned using the SspI site and one of the recognition sites from
within the polylinker. The SspI site (see list of sites in SEQ ID No
1 and 2 above) is located some 106 nucleotides 5' to the translational
start of lacZ' and displacement of the DNA normally found between SspI
and the polylinker within lacZ' results in recombinant plasmids which
no longer confer a blue colouration on cells in the presence of X-Gal.

In the case of the PGK::REP2 promoter the ATA of the hexanucleotide
sequence AATATT equates to the ATG start of the REP2 structural gene.
In the case of the control expression cassette, the same triplet
corresponds to the ATG start of the PGK structural gene. In both
cases, when the cassettes are digested with SspI, the DNA is cleaved
between the AT and A of the ATA triplet and a blunt-end is generated.

A DNA fragment carrying the gene to be expressed is then modified such
that the first nucleotide of its blunt-ended 5'-end is the "G" of
the translational start codon of the structural gene. The 3'-end of
this fragment may have any cohesive end compatible with those that can
be generated by cleavage at the hexanucleotide recognition sites
within lacZ'. Subsequent fusion of the 5' "G" nucleotide of the
heterologous gene to the "AT" blunt-end of the cassette generated by
SspI cleavage creates an ATG which is synonymous with both the
translational start of the heterologous gene and that of the structural
genes from which the promoter elements were derived, ie., PGK in the
case of the control cassette and REP2 in the case of the hybrid
promoter.



The net result of the utilisation of this cloning strategy is that no 
changes are made to the nucleotides within the 5' untranslated region 
of the resultant mRNA, nor are any changes made to the codons of the
gene being expressed. This would certainly not be true if a 
heterologous gene was merely inserted into the sites located solely in 
the polylinker region. 

The method of choice used to allow the isolation of the heterologous
gene as a blunt-ended fragment lacking the first two nucleotides of
the translational start codon involves creating a recognition site for
the restriction enzyme PstI at the start of the gene such that the
terminal "G" of the created hexanucleotide sequence CTGCAG corresponds
to the "G" of the gene's translational start. The site created in the
gene need not be PstI, but can be any site conforming to the consensus
CNNNNG (where "N" is equivalent to any nucleotide) which is cleaved by
a restriction enzyme immediately before the "G" nucleotide to give a
DNA terminus with a 3' overhang, ie., 3'-NNNN. Similarly, the
recognition site used in the expression cassette need not be solely
restricted to that of SspI, but can be any restriction site conforming
to the consensus NATTAN (where "N" is equivalent to any nucleotide)
which can be cleaved by a restriction enzyme between the two "T"
nucleotides to give a blunt-end.



One potential problem with this cloning strategy occurs if the
heterologous gene contains an internal PstI site. Two possible
solutions are, firstly, that the gene be inserted in a "two-step"
cloning strategy utilising another internal site 5' to the problem
PstI site. Secondly, an oligonucleotide can be designed such that its
5' end corresponds to the G residue of the ATG translational start
point. If this oligonucleotide is used in a PCR catalysed reaction to
isolate the gene of interest, then cleavage with PstI is unnecessary.
However, the original "PstI" strategy is preferable to this latter
strategy, since PCR products have frequently been shown to have
slightly heterogeneous termini (Hemsley et al., 1989).
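
Whether a gene raises the internal PstI problem can be checked before any cloning is planned. The sketch below is an illustration with a hypothetical function name and an invented sequence; it assumes the engineered site sits at the very start of the supplied sequence and scans only the given strand, which is sufficient because CTGCAG is palindromic.

```python
import re

# Report PstI sites other than the engineered one at the start of the gene.
# Positions are 0-based offsets into the supplied sequence.

PSTI_SITE = "CTGCAG"


def internal_site_positions(gene: str, site: str = PSTI_SITE, engineered_at: int = 0):
    """Return positions of every occurrence of site except the engineered one."""
    # The lookahead lets overlapping occurrences be reported as well.
    positions = [m.start() for m in re.finditer(f"(?={site})", gene.upper())]
    return [p for p in positions if p != engineered_at]


toy_gene = "CTGCAG" + "AAA" + "CTGCAG" + "TTT"  # engineered site plus one internal site
print(internal_site_positions(toy_gene))
```

An empty result means the direct PstI strategy can be used; any reported position would call for one of the two work-arounds described above.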

Expression Vectors: A new series of vector backbones was constructed
(see below), essentially consisting of replication regions from the
E. coli plasmid ColE1 and the yeast 2 µm plasmid. For selection in
E. coli they carried either the bacterial cat or bla gene, conferring
resistance to chloramphenicol (Cm) and ampicillin (Amp), respectively.
The markers allowing selection in S. cerevisiae were either the LEU2
or URA3 gene, which convert appropriately deficient host strains to
prototrophy. In the latter case, two alleles were constructed.
Plasmids are shown in Fig 4.

Regardless of the nature of the selectable marker, of bacterial or
yeast origin, every vector contains a unique SspI site between the
bacterial selectable marker and the 2 µm replication origin. It was
into this site that the expression cassette and control cassette were
inserted. The former was isolated as a 1.6 kb XmnI/SspI fragment, and
the latter as a 1.75 kb EcoRI/SphI fragment. Both DNA fragments were
blunt-ended by treatment with T4 DNA polymerase prior to their
insertion into the SspI site. The orientation of insertion was such
that lacZ' was counter transcribed relative to bla or cat.

Vector characteristics: CRM = chloramphenicol resistance marker. Gene
markers transcribe away from STB but can transcribe toward it.

pMTL 8110: CRM, leu2-dJ gene marker, no cassette.
pMTL 8120: CRM, a defective S. cerevisiae URA3 gene and no cassette.
pMTL 8130: CRM, ura3-dJ gene marker and no cassette.
pMTL 8131: CRM, ura3-dJ gene marker and a cassette driven by the
PGK promoter.
pMTL 8133: CRM, a defective S. cerevisiae URA3 gene and an expression
cassette driven by the PGK::REP2 promoter.
pMTL 8140: CRM, leu2-dJ gene marker and no cassette.

The vectors contain a minimum of 19 unique cloning sites in addition
to the unique SspI sites. Non-unique sites are given in Table 2.

Evaluation of the Expression Cassettes: The capabilities of the
expression system were initially assessed using the neo gene of the
transposon Tn903. It encodes aminoglycoside-3'-phosphotransferase
type I (APH1), which confers resistance to the antibiotic kanamycin
and its analogue G418 (Haas and Dowding, 1975). The gene was
available as a "Genblock" (1.5 kb EcoRI fragment) from Pharmacia. This
fragment was inserted into the EcoRI site of plasmid pUC8 to give
plasmid pGENBLOCK. PCR was used to amplify a 1.11 kb fragment
carrying the entire structural gene. During PCR the design of the
oligonucleotide employed as the primer to the 5' end of the gene was
such that a PstI recognition site was created. Specifically, the CAG
of the created hexanucleotide sequence CTGCAG replaced the neo
translational start codon.

The amplified fragment was digested with PstI and the overhanging 3'
ends were removed by utilising the 3' to 5' exonuclease activity of T4
DNA polymerase. The fragment was then ligated with the pMTL 8111 and
pMTL 8113 expression vectors which had previously been digested with
SspI and StuI and dephosphorylated. Colourless transformants were
screened for the presence of the neo insert and the correct
orientation by restriction analysis, and the plasmids obtained
designated pKAN8111 and pKAN8113, respectively. Cells of S. cerevisiae
strain AS33 carrying either plasmid were shown to be resistant to G418
at levels up to 3 mg/ml, indicative of extremely efficient expression
of the neo gene. In contrast, only E. coli cells containing pKAN8113
were able to grow in the presence of G418 (at levels greater than 1
mg/ml). Lysates prepared from yeast cells carrying either plasmid
were subjected to SDS-PAGE and the Coomassie stained
electrophoretograms scanned with a Joyce-Loebl laser densitometer. A
protein band equating to a size of 30,000 daltons was estimated to
represent some 5% of the cell's soluble protein.

Primer Extension Analysis of S. cerevisiae mRNA: In order to
ascertain the site(s) of transcriptional initiation within the two
fusion promoters, mRNA was isolated from exponentially growing YEPD
cultures of S. cerevisiae AS33 containing pKAN8111 and pKAN8113. A 25
bp oligonucleotide primer was synthesised, complementary to the coding
strand at +53 to +11 within the neo coding region, and purified to
homogeneity. It was not necessary to consider wild-type chromosomal
transcription, since the neo gene does not occur chromosomally.
Primer extension was performed and the products compared with
end-labelled DNA sequence reactions primed with the same
oligonucleotide primer.

The results demonstrated that the mRNA transcriptional start point
(tsp) of the PGK promoter of pKAN8111 maps to the nucleotide at -42. This
is one nucleotide further from the AUG than that reported by Van den
Heuvel et al. (1989) and 2 nucleotides further than that determined
by Mellor et al. (1985). Over 90% of transcription from the PGK:REP2
promoter of pKAN8113 appeared to initiate at nt -87 at a G residue.
Thus, the REP2 promoter tsp site plays no role in transcription; rather,
factors within the PGK portion of the promoter direct the position and
pattern of RNA initiation. Rathjen and Mellor (1990) have shown that
initiation in PGK is reliant on two cis-acting sequences, the TATA
element at nt -152 and a sequence, 5'-ACAGATCA-3', located immediately
5' to the site of RNA initiation, called the "determinator". In the
PGK:REP2 promoter, however, the first "C" of the determinator has been
deleted without any apparent effect.

Expression of PAL in E. coli and S. cerevisiae:

A PstI site was introduced over the authentic translational start point
of a PAL cDNA clone from Rhodosporidium toruloides (Anson et al., 1986;
Anson et al., 1987; Rasmussen and Orum, 1991) using PCR-mediated SDM
(Higuchi et al., 1988), with an XbaI site lying 115 bp downstream from
the PAL termination codon. The PAL gene was excised as a PstI
(blunt)/XbaI fragment and cloned into SspI/XbaI cut pMTL 8131 and pMTL
8133 to generate pPAL 8131 and pPAL 8133 respectively. The expression
of PAL in S. cerevisiae strain AS33 is shown in Table 3. The lower
expression levels obtained when cells are grown in rich media probably
reflect a drop in plasmid copy number (Rose and Broach, 1990), although
a decline in promoter activity and/or an increase in mRNA turnover
cannot be discounted.

The crude cell-free extracts were analysed by PAGE (Fig 5) and a band
corresponding to a protein of approximate MW 75 kD, which is present






only in the strain carrying pPAL 8133, was detected. This corresponds
to the molecular weight of the PAL monomer. The gel was scanned with
a laser densitometer (Joyce-Loebell), which calculated that this band
constitutes approximately 3% of total soluble cell protein. This
correlates well with the figure obtained by comparing the specific
activity of purified PAL at 30°C with the assay data. This would
indicate that the vast majority of the recombinant PAL is produced in
an active form.

PAL expression levels in E. coli TG1 (Table 3) confirmed the finding
that the PGK:REP2 promoter is highly active in E. coli, whilst the
native PGK promoter is inactive. Deletion of part of the putative
"-35" region resulted in partial loss of activity of this promoter in
E. coli (data not shown), indicating that it is indeed these signals
which are activating transcription in E. coli. Quantitative scanning
of polyacrylamide gels indicated PAL expression levels to be of the
order of 10% total soluble cell protein.

MATERIALS AND METHODS: Strains, Plasmids, Transformation and
Media.

The S. cerevisiae strain AS33 (a, his3-11, his3-15, leu2-3, leu2-112,
ura3-251, ura3-373, trp1) was used throughout. Yeast were transformed
by electroporation (Becker and Guarente, 1991) and transformants
selected by their ability to complement the appropriate auxotrophic
allele. E. coli strain TG1 (Carter et al., 1985) was used as host for
all DNA manipulations and bacterial expression studies. Plasmid
pVT100-U (Vernet et al., 1987) was a kind gift from Dr. T. Vernet
and plasmid pCM4 (Close and Rodriguez, 1982) was obtained from Pharmacia.
All DNA manipulations were carried out essentially as described in
Sambrook et al. (1989). Polymerase chain reaction (PCR) was carried
out on a programmable thermal cycler using Taq DNA polymerase
(Amplitaq, Perkin-Elmer Cetus). DNA sequencing was based on the
modified chain termination procedure described by Tabor and Richardson
(1987). Oligos were synthesised using an Applied Biosystems 380A DNA
synthesiser.

Site-directed mutagenesis (SDM) was performed by a number of
techniques. Initially, mutants were created using a derivation of the
method described by Carter et al. (1985). Subsequently, SDM was
performed by a method based on that described by Kunkel (1985).

Later mutagenesis experiments were carried out using a novel
coupled-primer method for SDM. Essentially, a PCR product was
generated using kinased oligos, one of which contained the mutagenic
mis-match, whilst the other was located at a point on the target
plasmid such that a restriction site, which was unique in the plasmid,
lay between the two primers. This PCR product was mixed with an
equimolar amount of target plasmid DNA, which had been passaged
through an E. coli dut ung strain and linearised at the unique
restriction site. The DNA mixture was denatured at 65°C for 5 min in
denaturing buffer (0.2 M NaOH, 0.2 mM EDTA), before neutralisation (2
M NH4Ac, pH 4.5) and subsequent ethanol precipitation. The DNA was
redissolved in annealing buffer (20 mM Tris-HCl, pH 7.-; 2 mM MgCl2,
50 mM NaCl) and annealed for 15 min at 37°C. Extension reactions were
carried out at 37°C for 1 hr in a buffer containing 1x T« buffer, 5 mM
DTT, 500 µM dNTPs, 250 µM rATP, 2 units T4 DNA ligase and 10 units
Sequenase. Aliquots of this reaction were then transformed into TG1.

Typical mutagenesis frequencies were in the region of 30%. This
technique obviates the need for sub-cloning into specialised vectors
or the use of repair-deficient strains.

Assay for PAL: PAL levels in cell-free extracts were assayed by the
method of Abell and Shen (1987). The production of cinnamic acid can be
followed spectrophotometrically at 290 nm. 0.67 ml distilled water,
assay buffer (500 mM Tris-HCl pH 8.5) and 0.17 ml L-phenylalanine (50
mM in 100 mM Tris-HCl pH 8.5) were combined in a 1 ml cuvette (Hughes
and Hughes Ltd., UV range). The cuvette and its contents were
pre-warmed to 30°C and placed in a Perkin-Elmer Lambda 2
spectrophotometer. 25 µl of crude cell extract was added and the
absorbance at 290 nm was monitored for 30 seconds at 30°C.

One unit of enzyme was defined as the amount catalysing the formation
of 1 µmol cinnamic acid per minute under the assay conditions used. The
molar absorption coefficient for cinnamate at 290 nm, 30°C, pH 8.5
(E290) was taken as 9 x 10^3 litre/mol/cm (Abell and Shen, 1987). The
level of PAL activity can then be calculated as follows:
E290 x µl of sample
1000

Protein concentrations were determined by the method of Bradford 
(1976). 
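
The assay arithmetic described above can be sketched with the Beer-Lambert law. This is a minimal illustration, not a reconstruction of the partly illegible formula in the source: the function name, the 1 ml assay volume, the 1 cm path length and the reading of E290 as 9 x 10^3 litre/mol/cm are all assumptions made for the example.

```python
def pal_units_per_ml(delta_a290_per_min, sample_ul,
                     assay_volume_ml=1.0,          # assumed cuvette fill volume
                     epsilon_l_per_mol_cm=9.0e3,   # E290 as read from the text
                     path_cm=1.0):                 # assumed optical path length
    """Return PAL activity in units (umol cinnamate/min) per ml of extract."""
    # Beer-Lambert: rate of concentration change in mol/litre/min.
    rate_molar = delta_a290_per_min / (epsilon_l_per_mol_cm * path_cm)
    # Convert to umol formed per minute in the whole assay volume.
    umol_per_min = rate_molar * (assay_volume_ml / 1000.0) * 1e6
    # Normalise to the volume of extract added (ul -> ml).
    return umol_per_min / (sample_ul / 1000.0)


# Hypothetical reading: absorbance rises by 0.09 per minute with 25 ul extract.
example_activity = pal_units_per_ml(0.09, 25)
```

Dividing the result by the protein concentration of the extract (mg/ml, e.g. from a Bradford assay) would give a specific activity in units per mg soluble protein.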

Derivation of the Expression Cassette: The initial stages involved in
construction were common to each cassette. Two mutagenic
oligonucleotides were employed to PCR amplify a 410 bp fragment of
pMTL23 encompassing lacZ' and the lac po region (Chambers et al.,
1988). The resultant modified fragment possessed a SspI site at
position -106 (relative to the lacZ' translational start codon) and a
HpaI site at nucleotide position +293 (relative to the lacZ' start
codon). The transcriptional termination signals of the PGK gene were
cloned from S. cerevisiae strain LL20 chromosomal DNA as a 373 bp
BglII/HindIII fragment into M13mtl20 (Chambers et al., 1988). The
restriction enzyme recognition sites for ClaI and SspI were eliminated
by SDM, and the DNA reisolated as a BglII/HindIII fragment. The 3'
end of the ADH1 locus was sub-cloned from pVT100-U (Vernet et al.,
1987) as a 335 bp SphI/HindIII fragment into similarly cleaved
M13mp18. An AccI recognition site was removed by SDM, and the region
carrying the desired transcriptional termination signals reisolated as
a 206 bp HincII/SphI fragment. The three DNA fragments specifying
lacZ', the PGK transcriptional terminator and the ADH1 transcriptional
terminator were then fused, by ligation with DNA ligase, in the order
and orientation shown in SEQ ID No 1 and 2. Prior to fusion, the
staggered ends of the DNA fragment encompassing the PGK
transcriptional terminator (those generated by cleavage with BglII and
HindIII) were blunt-ended by treatment with T4 DNA polymerase.

To complete the control cassette, a 3.1 kb HindIII fragment carrying
the PGK gene of S. cerevisiae strain LL20 was inserted into M13mp8
and SDM employed to create restriction recognition sites for EcoRI and
SspI. In the case of the SspI recognition site, its position was
such that the ATG triplet corresponding to the translational start
codon of the PGK structural gene became the ATA of the SspI site,
AATATT. A 766 bp fragment encompassing the transcriptional signals of
PGK was then isolated from the resultant mutagenic M13 clone,
M13PGK-J, following cleavage with EcoRI and SspI, and ligated to the
999 bp SspI/SphI fragment composed of lacZ'::PGK::ADH1, such that the
SspI recognition site was retained.
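
One reading of this design is that the blunt SspI cut (AAT^ATT) leaves the vector ending in AAT, so that any blunt insert beginning with G regenerates an ATG at the junction. The sketch below illustrates that reading with plain string operations; the vector and insert sequences are short hypothetical examples, and the helper names are illustrative only.

```python
SSPI_SITE = "AATATT"  # SspI cuts bluntly between AAT and ATT
PSTI_SITE = "CTGCAG"  # PstI cuts CTGCA^G, leaving a 4-nt 3' overhang


def sspi_left_arm(vector_top_strand):
    """Vector sequence up to the blunt SspI cut (ends in AAT)."""
    idx = vector_top_strand.index(SSPI_SITE)
    return vector_top_strand[: idx + 3]


def psti_blunted_insert(insert_top_strand):
    """Top strand left after PstI digestion and removal of the 3' overhang.

    Only the final G of the PstI site stays on the downstream fragment.
    """
    idx = insert_top_strand.index(PSTI_SITE)
    return insert_top_strand[idx + 5:]


def fuse(vector_top_strand, insert_top_strand):
    """Blunt-end fusion of the SspI-cut vector arm and the blunted insert."""
    return sspi_left_arm(vector_top_strand) + psti_blunted_insert(insert_top_strand)


# Hypothetical sequences, for illustration only.
vector = "CAACAAATATAAAACAATATTAGT"
insert = "GGCTGCAGCTAAAGGT"
left = sspi_left_arm(vector)
fused = fuse(vector, insert)
junction_codon = fused[len(left) - 2: len(left) + 1]
```

In this toy example the junction reads AAT followed by G, so the triplet spanning the junction is ATG.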

To complete the expression cassette containing the hybrid promoter, a
1.8 kb HindIII fragment (nucleotides 4621 to 32 of the sequence of
Hartley and Donelson, 1980) carrying the promoter of the 2 µm plasmid
REP2 gene was subcloned into the equivalent site of M13mp8.
Recognition sites for the restriction enzymes AccI and SspI were then
created in the sequence by SDM. This was achieved by changing the
hexanucleotide sequences GTTCTT and AATGGA (respective nucleotide
positions 5288 to 5283 and 5199 to 5194; Hartley and Donelson, 1980)
to GTCGAC and AATATT, respectively. Additionally, two "G" nucleotides
(positions 557 and 580 in SEQ ID No 2) were both changed to "T". The
recombinant plasmid obtained was designated M13REP2-J. An additional
recognition site for the restriction enzyme ClaI was also created
within the PGK derived region of M13PGK-J. The changes made are
detailed above in the section on features of SEQ ID No 2, at positions
725 and 727. The transcriptional signals of PGK were then isolated as
a 540 bp XmnI/ClaI fragment, and ligated to a 90 bp AccI/SspI
fragment isolated from M13REP2-J, such that fusion occurred between
the compatible ClaI and AccI derived DNA sticky ends. The resultant
630 bp fragment was then ligated to the 999 bp SspI/SphI fragment
composed of lacZ'::PGK::ADH1, such that the SspI recognition site was
retained.

Nucleotide sequence analysis of the various components of the
constructed cassettes indicated the presence of nucleotide differences
to previously published sequences, presumably a consequence of strain
variation. Specifically, several base differences were observed
between the transcriptional initiation and termination regions of the
PGK gene used here and that determined by Hitzeman et al. (1982). By
reference to SEQ ID No 2, the Hitzeman et al. (1982) sequence has 5
"A" nucleotides rather than the 4 beginning at position 760, lacks the
"G" at position 729, has an extra "A" between nucleotides 1399 and
1400, and an extra "T" nucleotide between positions 1493 and 1494.
Similarly, the "A" nucleotide at position 1663 was found to be a "G"
in the ADH1 gene determined by Bennetzen and Hall (1985).

Two additional nucleotide mutations occurred during the construction
of the expression cassette containing the hybrid promoter, around the
junction point between the PGK promoter and the REP2 leader region.
Thus, a "C" nucleotide base has been deleted from between positions
538 and 539 in SEQ ID No 1 (the "C" at position 716 in SEQ ID No 2),
and the nucleotide base at position 543 has become an "A", rather than
the "C" found in the equivalent position of the strain LL20 PGK
promoter (position 721 of SEQ ID No 2).

Example: Derivation of E. coli/S. cerevisiae shuttle vectors

Provision of E. coli maintenance and replication functions: The
first stage in the construction of the new E. coli/S. cerevisiae vectors
was to combine the replicative functions of an E. coli plasmid with
that of a S. cerevisiae plasmid. Two basic vectors were made,
pMTL8000 and pMTL8100. As shown in Figure 6, both were constructed by
isolating a 1.4 kb RsaI fragment, which encompassed the origin of
replication and STB locus of the 2 µm plasmid, from plasmid pVT100-U
(Vernet et al., 1987), and inserting it into the unique EcoRV sites of
either pMTLJ or pMTLCJ to give pMTL8000 or pMTL8100, respectively.

Plasmid pMTLJ was derived from pMTL4 (Chambers et al., 1988), by
eliminating the recognition site for the restriction enzyme SspI using
the plasmid SDM method. The steps involved in the derivation of
pMTLCJ are shown in Figure 7. Essentially, a 0.8 kb BamHI fragment,
encoding cat, was excised from plasmid pCM4 (Close and Rodriguez,
1982) and inserted into the BamHI site of M13mp8. The ssDNA prepared
from the resultant recombinant was then used as a template in
successive SDM experiments to eliminate restriction enzyme recognition
sites for EcoRI, NcoI and SspI from the cat structural gene. dsDNA
of the mutated M13 recombinant was then prepared, the modified cat
gene excised as a 0.8 kb BamHI fragment, blunt-ended by treatment with
DNA polymerase I Klenow fragment and ligated to a 1.1 kb SspI/DraI
fragment encompassing the replication region of plasmid pMTL4 to give
pMTLCJ.

The nucleotide sequences of pMTL8000 and pMTL8100 are shown as SEQ ID
No 3 and 4. The 2 µm replication region resides between nucleotides
3154 and 3376 of pMTL8000 and 3003 and 3225 of pMTL8100. The STB locus
is between nucleotides 2526 and 2817 of pMTL8000 and between 2375 and
2666 of pMTL8100. The bla structural gene begins at nucleotide 444 of
pMTL8000 and ends at position 1304. The cat structural gene of
pMTL8100 begins at nucleotide 461 and ends at position 111?. In both
cases, the amino acid sequences of the encoded proteins are shown below
the first nucleotide of the corresponding codon in the single letter
code. The ColE1 origin of replication lies at nucleotides 2063-2068
and 1912-1917 in pMTL8000 and pMTL8100, respectively.
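
Because the two vectors differ only in their resistance gene region, features lying beyond that region should be displaced by one constant offset. The short check below is a sketch using the coordinates as read here from the partly garbled text (the values 3154 and 3003 in particular are readings, not confirmed figures); the dictionary and variable names are illustrative.

```python
# (pMTL8000 coordinate, pMTL8100 coordinate) for each shared landmark.
shared_landmarks = {
    "2 um replication region, first nt": (3154, 3003),
    "2 um replication region, last nt": (3376, 3225),
    "STB locus, first nt": (2526, 2375),
    "STB locus, last nt": (2817, 2666),
    "ColE1 origin, first nt": (2063, 1912),
    "ColE1 origin, last nt": (2068, 1917),
}

# Offset of each landmark between the two plasmids.
offsets = {name: a - b for name, (a, b) in shared_landmarks.items()}

# Length of the bla structural gene (1-based, inclusive coordinates).
bla_length = 1304 - 444 + 1
bla_codons, bla_remainder = divmod(bla_length, 3)
```

With these readings every landmark is shifted by the same number of nucleotides, and the bla interval is a whole number of codons, which supports the internal consistency of the quoted coordinates.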

Provision for plasmid selection in S. cerevisiae: The
basic backbone of the vector series was completed by inserting DNA
sequence elements into pMTL8000 and pMTL8100 which allowed direct
selection of the described plasmid series in appropriate auxotrophic
S. cerevisiae host strains. Two different selective markers were
employed.

Firstly, a 1.17 kb BglII fragment containing the S. cerevisiae URA3
gene was sub-cloned from pVT100-U into the BamHI site of M13mp8. The
ssDNA prepared from the resultant recombinant was then used as a
template in successive SDM experiments designed to eliminate unique
restriction enzyme recognition sites for NdeI, NcoI and StuI, and two
AccI restriction sites. This modified gene was designated the URA3-J
allele. The complete sequence of the DNA fragment actually inserted
into the eventual expression vectors (see below) is shown as SEQ ID No
5. The URA3 structural gene initiates at nucleotide 234 and
terminates at nucleotide 1034. The amino acid sequence of the encoded
protein is shown in the single letter code below the first nucleotide
of the corresponding codon.

In addition to the standard URA3 selectable marker, a promoterless
version, ura3-d, was also created. SDM was employed to create a HpaI
site at nt -47 (relative to the AUG start codon) in the URA3-J allele.
This equates to changing the "C" nucleotide at position 189 to a "G".
Subsequent excision of the gene by cleaving with HpaI at this point
removes all sequences necessary for activation of the URA3 gene (Roy
et al., 1990), whilst retaining the major transcriptional start points
at nt -38 and -33 (Rose and Botstein, 1983). It was anticipated that
plasmids endowed with ura3-d would possess elevated plasmid copy
number under selective conditions, as observed with plasmids carrying
an equivalent promoterless LEU2 gene, leu2-d (Erhart and Hollenberg,
1983).

The second selectable marker used was the LEU2 gene. This was
sub-cloned as a 1.46 kb SspI fragment from pMA300 (Montiel et al.,
1984) into the SmaI site of pUC8. This fragment lacks the sequences
mapped as the UAS of LEU2 at -201 to -18? (Tu and Casadaban, 1990),
and disrupts the sequence upstream from LEU2 which codes for a
putative regulatory peptide (Andreadis et al., 1982). However, it
retains the TATA-like AT-rich sequence between bases -118 and -111 that
has been proposed as a site for the yeast TATA-binding factor TFIID
(Tu and Casadaban, 1990). The recombinant, pUC8-derived plasmid
carrying LEU2 was used as a template in SDM experiments to remove the
recognition sites for the restriction enzymes ClaI and EcoRI. In the
sequence shown as SEQ ID No 5 the URA3 structural gene initiates at
nucleotide 234 and terminates at nucleotide 1034. The amino acid
sequence of the encoded protein is shown in the single letter code
below the first nucleotide of the corresponding codon.

To insert the three alleles URA3-J, ura3-dJ and leu2-dJ into
pMTL 8000 and pMTL 8100, each allele was excised from the
appropriate plasmid and converted, where necessary, to a blunt-ended
DNA fragment. In the case of URA3-J, plasmid pURA3-J was cleaved with
AccI (cleaving at a site within the pUC8 polylinker region) and SmaI
(cleaving at a SmaI site residing some 79 nucleotides 3' to the
translational stop codon of URA3) and the released c. 1.1 kb fragment
carrying URA3 treated with T4 DNA polymerase. The exact sequence of
the blunt-ended fragment generated is shown in SEQ ID No 5. A c. 0.92
kb blunt-ended fragment carrying the ura3-dJ allele was obtained by
cleaving plasmid pURA3-dJ with HpaI and SmaI.

The nucleotide sequence of the fragment obtained exactly corresponds
to the sequence shown in SEQ ID No 5 between nucleotides 192 and 1115,
inclusive. Plasmid pLEU2-dJ was cleaved with EcoRI (at the
recognition site within the pUC8 polylinker region) and AccI (at a
recognition site located 100 nucleotides 3' to the translational stop
of URA3). The exact sequence of the blunt-ended fragment generated is
shown in SEQ ID No 6.
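
The SEQ ID No 5 coordinates quoted above can be cross-checked with simple interval arithmetic. The sketch below is illustrative only; it assumes 1-based inclusive coordinates and assumes that the base changed at position 189 is the first base of the new HpaI site (GTTAAC, cut bluntly as GTT^AAC).

```python
def inclusive_length(first, last):
    """Number of nucleotides in a 1-based, inclusive interval."""
    return last - first + 1


# Coordinates quoted in the text for SEQ ID No 5.
URA3_ORF = (234, 1034)
URA3_D_FRAGMENT = (192, 1115)

# Assumption: position 189 is the G that opens the HpaI site GTTAAC,
# so the blunt cut falls after position 191.
HPAI_SITE_FIRST_BASE = 189
first_base_after_cut = HPAI_SITE_FIRST_BASE + 3

orf_length = inclusive_length(*URA3_ORF)
fragment_length = inclusive_length(*URA3_D_FRAGMENT)
```

Under these assumptions the HpaI/SmaI fragment begins at nucleotide 192 and is 924 bp long, in line with the "c. 0.92 kb" quoted for the ura3-dJ fragment, and the URA3 interval is a whole number of codons.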

All three isolated fragments carrying URA3-J, ura3-dJ and leu2-dJ were
inserted into the unique HpaI site of both pMTL8000 and pMTL8100.
With one exception, all the recombinant plasmids obtained no longer
contained HpaI sites. The exceptions were the pMTL8000 and pMTL8100
derivatives carrying ura3-dJ, where the HpaI site is retained at the
junction point lying 5' to the gene. To avoid compromising the
segregational stability of the plasmids by potential read-through from
the selective markers into STB (Murray and Cesareni, 1986), clones
were orientated such that the yeast selective markers transcribed away
from the STB locus. For comparative purposes, a plasmid containing
the leu2-dJ allele transcribing towards STB was also constructed.

Physical characterisation of constructed vectors: Before
proceeding to insert the expression cassette into the vector series,
the basic backbone vectors were assessed with regard to their
stability (segregational and structural) and copy number.

Segregational stability in S. cerevisiae:

Plasmid segregational stability was estimated using methodology
described by Spalding and Tuite (1989). This involved following the
loss of a plasmid-encoded phenotypic marker over a number of
generations under non-selective conditions. The results are presented
in Table 1. All plasmids exhibited a greater degree of segregational
stability than that of the well characterised S. cerevisiae cloning
vector YEp24 (Botstein et al., 1979).

Structural stability: The structural stability of
plasmids in S. cerevisiae was assessed by transforming each plasmid
into strain AS33, growing cells for approximately 30 generations under
selective conditions, and then transferring each plasmid back to
E. coli by the procedure of Hoffman and Winston (1987). Plasmid DNA was
then prepared, by the method of Holmes and Quigley (1981), from the
resultant E. coli transformants and subjected to restriction enzyme
analysis. The restriction patterns obtained with all such plasmids
isolated from E. coli, using the enzymes SspI and EcoRV, were
identical to that of the CsCl-purified DNA originally transformed into
strain AS33.

Plasmid copy number: Plasmid copy number determination
was based on the non-isotopic technique of Futcher and Cox (1984).
Approximately 5 µg of total yeast DNA was digested simultaneously with
EcoRI and EcoRV. Following agarose gel electrophoresis, a negative






image of the restriction "spectrum" was scanned using a laser
densitometer (Joyce-Loebell). The intensity of the band corresponding
to plasmid DNA was compared with that of the 2.8 kb rDNA EcoRI
fragment. The rDNA was assumed to be present at 140 tandem copies
(Philipssen et al., 1991). Plasmid copy number was then calculated as
follows:



Plasmid copy number = (area under plasmid peak / area under rDNA 2.8 kb peak) x (2.8 / plasmid size (kb)) x 140
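
The copy number calculation above can be expressed as a small function. This is a sketch under the stated assumptions (a 2.8 kb rDNA reference fragment present at 140 copies per cell, and band area proportional to copies times fragment length); the function and parameter names are illustrative and the peak areas in the example are hypothetical.

```python
def plasmid_copy_number(plasmid_peak_area, rdna_peak_area, plasmid_size_kb,
                        rdna_fragment_kb=2.8, rdna_copies=140):
    """Estimate plasmid copies per cell from densitometer peak areas."""
    # Ratio of band intensities (plasmid band versus rDNA reference band).
    area_ratio = plasmid_peak_area / rdna_peak_area
    # Correct for fragment length: longer fragments give more signal per copy.
    size_correction = rdna_fragment_kb / plasmid_size_kb
    return area_ratio * size_correction * rdna_copies


# Hypothetical scan: equal peak areas for a 5.6 kb linearised plasmid.
example_copies = plasmid_copy_number(1.0, 1.0, 5.6)
```

In this hypothetical case the plasmid band is twice the length of the reference fragment, so equal areas correspond to half as many copies as the rDNA repeat.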

Using this method, the copy numbers of the basic plasmid vectors in
S. cerevisiae were compared to previously characterised high copy
number (pMA3a; Spalding and Tuite, 1989) and low copy number (YEp24;
Botstein et al., 1979) plasmids. The results in Table 1 confirm that
low copy number (pMTL 8120) and high copy number (pMTL 8110, 8130 and
8140) versions of the vectors described in the present invention
have been constructed.

Table 1 Segregational stability and copy number analysis of the pMTL
81x0 series of vectors.

Plasmid           Cells containing   Plasmid loss/        Average copy
                  plasmid (%)*       cell div (x 10^-2)   number/cell

pMTL 8110                            0.842                111
pMTL 8120         77.5               1.174                50
pMTL 8130         82.0               0.992                151
pMTL 8140         85.5               0.783                106
YEp24 (URA3)      76.0               1.372                48
pMA3a (leu2-d)    ND                 ND                   106

* After 20 generations of non-selective exponential growth.
Segregational stability was determined using methodology described by
Spalding and Tuite (1989) and is an average of two or more independent
experiments. Copy number data is for cells grown in minimal media and
is based on the assumption that all cells contain plasmid under these
conditions. The selective marker present within each vector is shown
in brackets. R = reverse orientation. ND = not determined.
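
The two stability columns of Table 1 appear to be related by a simple exponential-loss model: if a fraction f of cells still carries the plasmid after n generations, the loss rate per division is -ln(f)/n. This relation is inferred here from the tabulated figures rather than stated in the text, so the sketch below should be read as an assumption; the function name is illustrative.

```python
import math


def loss_per_division(fraction_with_plasmid, generations=20):
    """Plasmid loss rate per cell division under an exponential-loss model."""
    return -math.log(fraction_with_plasmid) / generations


# Percent plasmid-containing cells after 20 generations, from Table 1.
table_rows = {"pMTL 8130": 82.0, "pMTL 8140": 85.5, "YEp24": 76.0}

# Loss rates expressed as in the table (x 10^-2 per cell division).
loss_x100 = {
    name: round(100 * loss_per_division(percent / 100.0), 3)
    for name, percent in table_rows.items()
}
```

For these three rows the model reproduces the tabulated loss rates to three decimal places. Applied to the pMTL 8120 row (77.5%), it gives about 1.27 rather than the printed 1.174, so one of those two printed values may be a misprint or a scanning error.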

Table 2 Non-unique restriction sites present within the polylinkers of
the pMTL 8XXX series of vectors.

Marker     PGK                  No Promoter    PGK:REP2

leu2-d     EcoRV, KpnI, SstI    EcoRV, KpnI    EcoRV, KpnI, SstI
URA3       EcoRV                EcoRV          EcoRV
ura3-d     EcoRV, SstI          EcoRV          EcoRV, SstI
leu2-d     EcoRV, KpnI, SstI    EcoRV, KpnI    EcoRV, KpnI, SstI



Table 3 Expression of PAL in S. cerevisiae AS33 and E. coli TG1.
Figures refer to units x10-^mg soluble protein. At least three
separate assays were performed for each sample and the maximum error
is shown.

Strain, medium and growth phase                pMTL 8130   pPAL 8133   pPAL 8131

S. cerevisiae AS33, minimal media, stationary  0           35.5 ± 2    18.1 ± 2
S. cerevisiae AS33, YEPD, early exponential    0           37.8 ± 3    ND
S. cerevisiae AS33, YEPD, stationary           0           16.5 ± 1    8.5 ± 0.7
E. coli TG1, 2xYT, stationary                  0           35.2 ± 2    0

(1) GENERAL INFORMATION:

(A) NAME: THE PUBLIC HEALTH LABORATORY SERVICE BOARD
(B) STREET: 61 COLINDALE AVENUE
(C) CITY: LONDON
(E) COUNTRY: UNITED KINGDOM (GB)
(F) POSTAL CODE (ZIP): NW9 5DF

(A) NAME: NIGEL PETER MINTON
(B) STREET: 27 MOBERLY ROAD
(C) CITY: SALISBURY
(D) STATE: WILTSHIRE
(E) COUNTRY: UNITED KINGDOM (GB)
(F) POSTAL CODE (ZIP): SP1 3BZ

(A) NAME: JAMES DUNCAN BRUCE FAULKNER
(B) STREET: 14 BISHOPS COURT, JOHN GARNE WAY
(C) CITY: MARSTON, OXFORD
(D) STATE: OXFORDSHIRE
(E) COUNTRY: UNITED KINGDOM (GB)
(F) POSTAL CODE (ZIP): OX3 0TX

(ii) TITLE OF INVENTION: BIFUNCTIONAL EXPRESSION VECTOR

(iii) NUMBER OF SEQUENCES: 6

(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1619 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Saccharomyces cerevisiae

(ix) FEATURE:
(A) NAME/KEY: misc_recomb
(B) LOCATION: 540..541
(ix) FEATURE:
(A) NAME/KEY: misc_recomb
(B) LOCATION: 635..636
(ix) FEATURE:
(A) NAME/KEY: misc_recomb
(B) LOCATION: 1035..1036
(ix) FEATURE:
(A) NAME/KEY: misc_recomb
(B) LOCATION: 1411..1412
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 550..555
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 574..579
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 668..673
(ix) FEATURE:
(A) NAME/KEY: misc_feature
(B) LOCATION: 692..697

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GAATTCTTTTC CCTCCTTCTT GAATTGATGT TACCCTCATA AAGCACGTGG CCTCTTATCG 60 
AGAAAGAAAT TACCGTCGCT CCTTGATTrGT TTGCAAAAAG AACAAAACTG AAAAAACCCA 120 
GACACGCTCG ACTTCCTGrrC TTCCTATTGA TTGCAGCTrC CAATTTCGTC ACACAACAAG I80 
GTCCTAGCGA CGGCTCACAG GTnTGTAAC AAGCAATCGA AGGTTCTGGA ATGGCGGGAA 240 
AGGGTTTAGT ACCACATGCT ATGATGCCCA CTGTGATCTC CAGAGCAAAG TTCGTTCGAT 300 
CGTACTGTTA CTCTCTCTCT TTCAAACAGA ATTGTCCGAA TCGTGTGACA ACAACAGCCT 36O 
GTTCTCACAC ACTCTTITCT TCTAACCAAG GGGGTGGnTT AGTITAGTAG AACCTCGTGA 420 
AACTTACATT TACATATATA TAAACTTGCA TAAATTGGTC AATGCAAGAA ATACATATTT 480 
GGTCmrCT AATTCGTAGT TTTTCAAGIT CTTAGATGCT TTCnTTTCT CTTinTAAG 540 
ATAATCGACT TGACATTTGA TCTGCACAGA TTTTATAATT TAATAAGCAA GAATACATTA 6OO 
TCAAACGAAC AATACTGGTA AAAGAAAACC AAAATATTAG TTAGCTCACT CATTAGGCAC 66O 
CCCAGGCTTT ACACTTTATG CTTCCGGCTC GTATGTTGTa TGGAATTGTG AGCGGATAAC 720 
AATTTCACAC AGGAAACAGC TATGACCATG ATTACGCCAA GCTCGCGAGG CCTCGAGATC 78O 
TATCGATGCA TGCCATGGTA CCCGGGAGCT CGAATTCTAG AAGCTTCTGC AGACGCGTCG 840 
ACGTCATATG GATCCGATAT CGCCGGCAAT TCACTGGCCG TC UTl ' lTA CA ACGTCGTGAC 9OO 
TGGGAAAACC CTGGCGTTAC CCAACTTAAT CGCCTTGCAG CACATCCCCC TTTCGCCAGC 960 
TGGCGTAATA GCGAAGAGGC CCGCACCGAT CGCCCTTCCC AACAGTTGCG TAGCCTGAAT 1020
GGCGAATGGC GCGITGATCr CCCATdTCTC TACTGCrrGGri GGTGCTrCTr TGGAATTATT IO8O 
GGAAGCTTAAG GAATTGCCAG GrOTTGCTrr CTTATCCGAA AAGAAATAAA TTGAATTGAA ll40 
TTGAAATCCA TAGATCAATT TTITrCTITT CTCTTTCCCC ATCCTTTACG CTAAAATAAT 1200 
AGriTATITr ATrmTGAA TATATirrAT rrATATACOr ATATATAGAC TATTATITAG I26O 
TTTTAATGAT TATTAAGATT TTTATrAAAA AAAAATTCGr CCCTCTinT AATGCCnTT 1320 

ATGCAGTrrr TrrrrcccAT tcgatatitc tatgttcggg ttcagcgtat TrrAAGrrrA 1380 

ATAACTCGAA AATTCTGCGr TCGITAAAGC TGACACTTCT AAATAAGCGA ATTrCTTATG l440 
ATTTATGATr TTTATTATrA AATAAGTTAT AAAAAAAATA AGTITATACA AATTTTAAAG 1500 
TGACTCTTAG GnTTAAAAC GAAAATTCTT ATTCTTGAGT AACTCTTTCC TOrAGGTCAG I56O 

orrGCTrrcr caggtatagc atgaggtcgc tcttattgac cacacctcta ccggcatgc 1619 

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1754 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Saccharomyces cerevisiae

(ix) FEATURE:
(A) NAME/KEY: misc_recomb
(B) LOCATION: 546..547
(ix) FEATURE:
(A) NAME/KEY: misc_recomb
(B) LOCATION: 635..636
(ix) FEATURE:
(A) NAME/KEY: misc_recomb
(B) LOCATION: 1035..1036
(ix) FEATURE:
(A) NAME/KEY: misc_recomb
(B) LOCATION: 1411..1412

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
GAATTCAACT CAAGACGCAC AGATATTATA ACATCTGCAT AATAGGCATT TGCAAGAATT 60
ACTCGTGAGT AAGGAAAGAG TGAGGAACTA TCGCATACCT GCATTTAAAG ATGCCGATTT 120 
GGGCGCGAAT CCnTATTTT GGCTTCACCC TCATACTATT ATCAGGGCCA GAAAAAGGAA 180 
GTGTTTCCCT CCTTCTTGAA TTGATGTTAC CCTCATAAAG CACGTGGCCT CTTATCGAGA 240 
AAGAAATTAC CGTCGCTCGT GATTTGTTTG CAAAAAGAAC AAAACTGAAA AAACCCAGAC 300 
ACGCTCGACT TCCTGTCTTC CTATTGATTG CAGCTTCCAA TTTCGTCACA CAACAAGGTC 36O 
CTAGCGACGG CTCACAGGTT TTGTAACAAG CAATCGAAGG TTCTGGAATG GCGGGAAAGG H20 
GTTTAGTACC ACATGCTATG ATGCCCACTG TGATCTCCAG AGCAAAGTTC GTTCGATCGT 480 
ACTGTTACTC TCTCTCTITC AAACAGAATT GTCCGAATCG TGTGACAACA ACAGCCTGTT 540 
CTCACACACT CTTrrCTTCT AACCAAGGGG GTGGTTTAGT TTAGTAGAAC CTCCTGAAAC 6OO 
TTACATITAC ATATATATAA ACTTGCATAA ATTGGTCAAT GCAAGAAATA CATATITGGT 660 

cnrrcTAAT TCGTAGrriT tcaagttctt AGATGcnrc TmTCTCTT TmACAGAT 720 

CATCAAGGGA AGTAATTATC TACTITITAC AACAAATATA AAACAATATT AGTTAGCTCA 78O 
CTCATTAGGC ACCCCAGGCT TTACACTITA TGCTTCCGGC TCGTATGTTG TGTGGAATTG 840 
TGAGCGGATA ACAATTTCAC ACAGGAAACA GCTATGACCA TGATTACGCC AAGCTCGCGA 900 
GGCCTCGAGA TCTATCGATG CATGCCATGG TACCCGGGAG CTCGAATTCT AGAAGCTTCT 96O 
GCAGACGCGT CGACCJTCATA TGGATCCGAT ATCGCCGGCA ATTCACTGGC CC?rCGTITrA 1020 
CAACGTCGTG ACTGGGAAAA CCCTGGCGTT ACCCAACTTA ATCGCCTTGC AGCACATCCC IO8O 
CCTTTCGCCA GCTGGCGTAA TAGCGAAGAG GCCCGCACCG ATCGCCCTTTC CCAACAGTTG llHO 
CGTAGCCTGA ATGGCGAATG GCGCGTTGAT CTCCCATGTC TCTACTGGTG GTGGTGCTTC 1200 
TITGGAATrA TTGGAAGGTA AGGAATTGCC AGGTGTTGCT TTCTTATCCG AAAAGAAATA 1260 
AATTGAA-ITG AAITGAAATC CATAGATCAA iTlTlTrC'lT TTCTCTTTCC CCATCCTTTA I32O 
CGCTAAAATA ATAGTTTATT TTATmTTG AATATATTTT ATTTATATAC GTATATATAG I38O 
ACTATTATTT ACTTTTAATG ATTATTAAGA TmTATTAA AAAAAAATTC GTCCCTCTIT 1440 
TTAATGCCTT TTATGCAC7IT TnTTTTCCC ATTCGATATT TCTATGTTCG GGTTCAGCGT I5OO 
AlTTTAAGrr TAATAACTCG AAAATTCTGC GTrCGTITAAA GCTGACACTT CTAAATAAGC I56O 
GAAirrCTTA TGATTTATGA TmTATTAT TAAATAAGTT ATAAAAAAAA TAAGTITATA l620 
CAAATTITAA AGTGACTCTT AGGTriTAAA ACGAAAATTC TTATTCTTGA GTAACTCCTC I68O 
TrrCCTGTAG GTCAGGTTGC TTTCTCAGGT ATAGCATGAG GTCGCTCTTA TTGACCACAC 1740 
CTCTACCGGC ATGC I754 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3400 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iii) ANTI-SENSE: NO

(vi) ORIGINAL SOURCE:
(A) ORGANISM: Saccharomyces cerevisiae

(ix) FEATURE:
(A) NAME/KEY: misc_recomb
(B) LOCATION: 290..291
(ix) FEATURE:
(A) NAME/KEY: misc_recomb
(B) LOCATION: 2294..2295

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
AATATm-AG TAGCTCGTTA CACirCCGGTG CGrnTITGGr TmTGAAAG TGCGTCTTCA 60 
GAGCGCTTrr GCnrrrCAAA AGCGCTCTGA AGrrCCTATA CTITCTAGCT AGAGAATAGG 120 
AACTTCGGAA TAGGAACTTC AAAGCGTTTC CGAAAACGAG CGCTTCCGAA AATGCAACGC l80 
GAGCTGCGCA CATACAGCTC ACTCHTCACG TCGCACCTAT ATCTGCCTGr TGCCTGTATA 240 
TATATATACA TGAGAAGAAC GGCATAGTGC GrGTITATGC TTAAATGCGT ATCCCGCAAG 300 
AGGCCCGGCA CTCAGCrGGC ACTTITCGGG GAAATCTGCG CGGAACCCCT ATITGITrAT 360 
TTTTCTAAAT ACATTCAAAT ATCTTATCCGC TCATGAGACA ATAACCCTGA TAAATGCTTC 420 
ATTACTATTG AAAAAGGAAG AGTATGACTA TTCAACATIT CCGrTCTCGCC CTTATTCCCT 480 
nTTTGCGGC ATnTGCCTT CCTCmTITG CTCACCCAGA AACGCTGGTG AAAGTAAAAG 540 
ATGCTGAAGA TCACTrGGGT GCACGAGTGG GTTACATCGA ACTGGATCTC AACAGCGGTA 600 
AGATCCTTGA GAGTITrCGC CCCGAAGAAC GTITrCCAAT GATGAGCACT TTTAAAGTrC 660 
TGCTATCTGG CGCGCTATTA TCCCGTATTG ACGCCGGGCA AGAGCAACTC GGTCGCCGCA 720 
TACACTATTC TCAGAATGAC TrGGTITGACTr ACTCACCACT CACAGAAAAG CATCTTACGG 780 
ATGGCATGAC AGTAAGAGAA TTATGCAGTG CTGCCATAAC CATGAGTGAT AACACTGCGG 840 
CCAACTTACT TCTGACAACG ATCGGAGGAC CGAAGGAGCT AACCGCTTTT TTGCACAACA 900 
TGGGGGATCA TCTAACTCGC CTTGATCGIT GGGAACCGGA GCTGAATGAA GCCATACCAA 960 
ACGACGAGCG TGAGACCACG ATGCCTGTAG CAATGGCAAC AACGTTGCGC AAACTATTAA 1020
CTGGCGAACT ACTTACTCTA GCTTCCCGGC AACAATTAAT AGACTGGATG GAGGCGGATA 1080
AAGTTGCAGG ACCACTTCTG CGCTCGGCCC TTCCGGCTGG CTGGTTTATT GCTGATAAAT 1140
CTGGAGCCGG TGAGCGTGGG TCTCGCGGTA TCATTGCAGC ACTGGGGCCA GATGGTAAGG 1200
CCTCCCGTAT CGTAGTTATC TACACGACGG GGACTCAGGC AACTATGGAT GAACGAAATA 1260
GACAGATCGC TGAGATAGGT GCCTCACTGA TTAAGCATTG GTAACTGTCA GACCAAGTIT 1320
ACTCATATAT ACTTTAGATT GATTTAAAAC TTCATTnTA ATTTAAAAGG ATCTAGGTGA 1380
AGATCCTTTT TGATAATCTC ATGACCAAAA TCCCTTAACG TGAGTTTTCG TTCCACTGAG 1440
CGTCAGACCC CGTAGAAAAG ATCAAAGGAT CTTCTTGAGA TCCTTTTTTT CTGCGCGTAA 1500
TCTGCTGCTT GCAAACAAAA AAACCACCGC TACCAGCGGT GGTTTGnTG CCGGATCAAG 1560
AGCTACCAAC TCTTTTTCCG AAGGTAACTG GCTTCAGCAG AGCGCAGATA CCAAATACTG 1620
TTCTTCTAGT GTAGCCGTAG TTAGGCCACC ACTTCAAGAA CTCTGTAGCA CCGCCTACAT 1680
ACCTCGCTCT GCTAATCCTG TTACCAGTGG CTGCTGCCAG TGGCGATAAG TCGTGTCTTA 1740
CCGGGTTGGA CTCAAGACGA TAGTTACCGG ATAAGGCGCA GCGGTCGGGC TGAACGGGGG 1800
GTTCGTGCAC ACAGCCCAGC TTGGAGCGAA CGACCTACAC CGAACTGAGA TACCTACAGC 1860
GTGAGCATTG AGAAAGCGCC ACGCTTCCCG AAGGGAGAAA GGCGGACAGG TATCCGGTAA 1920
GCGGCAGGGT CGGAACAGGA GAGCGCACGA GGGAGCTTCC AGGGGGAAAC GCCTGGTATC 1980
TTTATAGTCC TGTCGGGTTT CGCCACCTCT GACTTGAGCG TCGATTTITG TGATGCTCGT 2040
CAGGGGGGCG GAGCCTATGG AAAAACGCCA GCAACGCGGC CTTTTTACGG TTCCTGGCCT 2100
TTTGCTGGCC TTTTGCTCAC ATGTTCTTTC CTGCGTTATC CCCTGATTCT CTTGOATAACC 2160
GTATTACCGC CTTTGAGTGA GCTGATACCG CTCGCCGCAG CCGAACGACC GAGCGCAGCG 2220
AGTCAGTGAG CGAGGAAGCG GAAGAGCGCT AGCAGCACGC CATAGTGACT GGCGATGCTG 2280
TCGGAATGGA CGATACTTGT TACCCATCAT TGAATTTTGA ACATCCGAAC CTGGGAGTTr 2340
TCCCTGAAAC AGATAGTATA TTTGAACCTG TATAATAATA TATAGTCTAG CGCTTTACGG 2400
AAGACAATGT ATGTAnTCG GTTCCTGGAG AAACTATTGC ATCTATTGCA TAGGTAATCT 2460
TGCACGTCGC ATCCCCGGTT GATTTTCTGC GTTTCCATCT TGCACTTCAA TAGCATATCT 2520
TTGTTAACGA AGCATCTGTG CTTCATTTTG TAGAACAAAA ATGCAACGCG AGAGCGCTAA 2580
TmTCAAAC AAAGAATCTG AGCTGCATTT TTACAGAACA GAAATGCAAC GCGAAAGCGC 2640
TATTTTACCA ACGAAGAATC TGTGCTTCAT TTTTGTAAAA CAAAAATGCA ACGCGAGAGC 2700
GCTAATTTIT CAAACAAAGA ATCTGAGCTG CATnTTACA GAACAGAAAT GCAACGCGAG 2760
.acocTArrr taccaacma c^a^ rrcLrrc ^c.c^ -ocatccco 2820

ZZL. — — ~ — — Z 

.MTCCAcrc TcrrcATAAC ™act orAccrccar taaoc^aoa aoaaocctac 2940
^cororcT ath^c^ ccataaaaaa accctoactc cac^ccccc ™cat 3000

..CAOCOAA CA™ —CO A.™ ~ - 

CCCATOrOOA ™TAC ~CA CAAAOrCATA cc— Trc^ 3 

1.AAAA. — — ~ — — : : 

^ACArmc cTATrarm coArrcACTC tatoaataot tcttactaca Arrrrrrrcr 3240
CAAAOAOTA A— .AAACA.AAA AAA^ ~A CATCCAA^ 3300
CAACOACCCA AACOrCCATC OCAC™ ATAOCOATAT ACCACAOAOA TATATAOCAA 3360
AGAGATACrr TTGAGCAATG TTTGrrGGAAG CGGTATTCGC 3400

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3249 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO 

(iii) ANTI-SENSE: NO 
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Saccharomyces cerevisiae

(ix) FEATURE: 

(A) NAME/KEY: misc_recomb

(B) LOCATION: 290..291
(ix) FEATURE: 

(A) NAME/KEY: 

(B) LOCATION: 426..427
(ix) FEATURE: 

(A) NAME/KEY: misc_recomb

(B) LOCATION: 1213..1214
(ix) FEATURE:

(A) NAME/KEY: misc_recomb

(B) LOCATION: 2143..2144

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
AATATTTTAG TAGCTCOTTA CAGTCCGGTG CC^GC TnTTGAAAG TGCOTCTTCA 60
GOrmCAAA AGCGCTCTGA AOTTCCTATA CTTTCTAGCT AGAGAATAGG 120 
— CGAAAACGAG CGCITCCGAA AATGCAACGC iSO 

TtGCGCA CATACAG.C ACT^CACG TCGCACCTAT ATCTGCCC TGCCTOIATA 240 
TATATATACA TGAGAAGAAC GGCATAGTGC GTGTrTATGC TTAAATGCGT ATCCCGCAAG 300
AGGCCCGGCA CTTCAGGTGGC ACTTTTCGGG GAAATGTGCG CGGAACCCCT ATTrGTTTAT 360
■nrrCTAAAT ACATTCAAAT ATGTATCCGC TCATGAGACA ATAACCCTGA TAAATGCTTC 420
AATAATGATC CACGAGATTT CAGGAGCTAA GGAAGCTAAA ATGGAGAAAA AAATCACTGG 480
ATATACCACC GTTGATATAT CCCAATGGCA TCGTAAAGAA CATrrTGAGG CATTTCAGTC 540
AGTTGCTCAA TGTACCTATA ACCAGACCGT TCAGCTGGAT ATTACGGCCT TTTTAAAGAC 600
CGTAAAGAAA AATAAGCACA AGmTATCC GGCCTTTATT CACATTCTTG CCCGCCTGAT 660
GAATGCTCAT CCGGAGTTCC GTATGGCAAT GAAAGACGGT GAGCTGGTGA TATGGGATAG 720
TGTTCACCCT TGTTACACCG TTITCCATGA GCAAACTGAA ACGTTTrCAT CGCTCTGGAG 780
TGAATACCAC GACGATTTCC GGCAGTTrCT ACACATATAT TCGCAAGATG TGGCGTGTTA 840
CGGTGAAAAC CTGGCCTATT TCCCTAAAGG OnTATTGAG AATATGTnT TCGTCTCAGC 900
CAATCCCTGG GTGAGTITCA CCACmTTGA TTTAAACGTG GCCAATATGG ACAACTTCTT 960
CGCCCCCGIT TTCACAATGG GCAAGTATTA TACGCAAGGC GACAAGGTGC TGATGCCGCT 1020
GGCGATTCAG GTTCATCATG C.CGTTTGTGA TGGCTTCCAT GTCGGCAGAA TGCTTAATGA 1080
ATTACAACAG TACTGCGATG AGTGGCAGGG CGGGGCGTAA nTTTTTAAG GCAGTTATrG 1140
GTGCCCTTAA ACGCCTGGTG CTACGCCTGA ATAAGTGATA ATAAGCGGAT GAATGGCAGA 1200
AATTCGTCGG ATCAAAAGGA TCTAGGTGAA GATCCmTT GATAATCTCA TGACCAAAAT 1260
CCCTTAACGT GAGTTTTCGT TCCACTGAGC GTCAGACCCC GTAGAAAAGA TCAAAGGATC 1320
TTCTTGAGAT CCTnmTG TGCGCGTAAT CTGCTGCTTG CAAACAAAAA AACCACCGCT 1380
ACCAGCGGTG GTITGTrrGC CGGATCAAGA GCTACCAACT CTnTTCCGA AGGTAACTGG 1440
CTTCAGCAGA GCGCAGATAC CAAATACTGT TCTTCTAGTG TAGCCGTAGT TAGGCCACCA 1500
CTTCAAGAAC TCTGTAGCAC CGCCTACATA CCTCGCTCTG CTAATCCTGT TACCAGTGGC 1560
TGCTGCCAGT GGCGATAAGT CGTGTCTTAC CGGGTrGGAC TCAAGACGAT AGTTACCGGA 1620
TAAGGCGCAG CGGTCGGGCT GAACGGGGGG TTCGTGCACA CAGCCCAGCT TGGAGCGAAC 1680
GACCTACACC GAACTGAGAT ACCTACAGCG TGAGCATTGA GAAAGCGCCA CGCTTCCCGA 1740
AGGGAGAAAG GCGGACAGGT ATCCGGTAAG CGGCAGGGTC GGAACAGGAG AGCGCACGAG 1800
GGAGCTTCCA GGGGGAAACG CCTGGTATCT TTATAGTCCT GTCGGGTITC GCCACCTCTG 1860
ACTTGAGCGT CGAlTiTi'GT GATGCTCGTC AGGGGGGCGG AGCCTATGGA AAAACGCCAG 1920
CAACGCGGCC TTTTTACGGT TCCTGGCCTT TTGCTGGCCT TTTGCTCACA TGTTCTTTCC 1980
TGCGTTATCC CCTGATTCTG TGGATAACCG TATTACCGCC TTTGAGTGAG CTGATACCGC 2040
TCGCCGCAGC CGAACGACCG AGCGCAGCGA GTCAGTGAGC GAGGAAGCGG AAGAGCGCTA 2100
GCAGCACGCC ATACTTGACTG GCGATGCTGT CGGAATGGAC GATACTTGTr ACCCATCATT 2160
GAATTTTGAA CATCCGAACC TGGGAGTITr CCCTGAAACA GATAGTATAT TTGAACCTGT 2220
ATAATAATAT ATACTCTAGC GCTTTACGGA AGACAATGTA TCTATITCGG TTCCTGGAGA 2280
AACTATTGCA TCTATTGCAT AGGTAATCTT GCACCTCGCA TCCCCGGrrC ATTITCTGCG 2340
TrrCCATCTT GCACTTCAAT AGCATATCTT TGITAACGAA GCATCTCrTGC TrCATnTGT 2400
AGAACAAAAA TGCAACGCGA GAGCGCTAAT TTTTCAAACA AAGAATCTGA GCTGCATTIT 2460
TACAGAACAG AAATGCAACG CGAAAGCGCT ATTTTACCAA CGAAGAATCT GTGCTTCATr 2520
TTTGrTAAAAC AAAAATGCAA CGCGAGAGCG CTAATITITC AAACAAAGAA TCTGAGCTGC 2580
ATinTACAG AACAGAAATG CAACGCGAGA GGGCTATTTT ACCAACAAAG AATCTATACT 2640
TCTmrrGT TCTACAAAAA TGCATCCCGA GAGCGGTATT TTTCTAACAA AGCATCTTAG 2700
ATTACTmT TTCTCCTITG TGCGCTCTAT AATGCACTCT CTTGATAACT TTTTGCACTG 2760
TAGGTCCCJrr AAGGITAGAA GAAGGCTACT TTGCrrGTCTA TrrrCTCTTC CATAAAAAAA 2820
gcctgacttc acttcccgcg tttactgatt actagcgaag ctgcggcttgc ATrrrrrcAA 2880
GATAAAGGCA TCCCCGATTA TATTCTATAC CGATCTGGAT TGCGCATACT TTCrrGAAGAG 2940
AAACTGATAG CGTTGATGAT TCTrCATTGG TCAGAAAATT ATGAACGCTIT TCTTCTATTT 3000
TCTCTCTATA TACTACGTAT AGGAAATGTT TACATirrCG TATTGrnTrC GATTCACTCT 3060
ATGAATACTTT CTTACTACAA ■ mTlTl GTC TAAAGAGTTAA TACTAGAGAT AAACATAAAA 3120
AATGTAGAGG TCGAOTTTAG ATGCAAGriTC AAGGAGCGAA AGGTGGATGG GTAGGTTATA 3180
TAGGGATATA GCACAGAGAT ATATAGCAAA GAGATACTTT TGAGCAATCT TTGTGGAAGC 3240
GCrrATTCGC 3249
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1115 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Saccharomyces cerevisiae

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
TCGACGGATC TGCCTTTrCA ATTCAATTCA TCATrmTT TrrATTCTIT TnTTGATTr 60 
CGGmTCCTT GAAATTTTTT TGATTCGCJTA ATCTCCGAAC AGAAGGAAGA ACGAAGGAAG 120
GAGCACGATT TrrGCATGGrr ATATATACGG ATATGTAGTG TTGAAGAAAC ATGAAATTGC 180
CCAGTATTCT TAACCCAACT GCACAGAACA AAAACCGGAA ACGAAGATAA ATCATGTCGA 240
AAGCTACATA TAAGGAACGT GCTGCTACTC ATCCTAGTCC TGTTGCTGCC AAGCTATTTA 300
ATATCATGCA CGAAAAGCAA ACAAACTTGT GTGCTTCATT GGATGTTCGT ACCACCAAGG 360
AATTACTGGA GTTACnTGAA GCATTAGGTC CCAAAATTTG TTTACTAAAA ACACATCTGG 420
ATATCTTGAC TGATnTTCG ATGGAGGGCA CAGriTAAGCC GCTAAAGGCA TTATCCGCCA 480
AGTACAATTT TITACTCTrC GAAGACAGAA AATTTGCTGA CATTGGTAAT ACAGTCAAAT 540
TGCACTTACTC TGCGGCTCTC TATAGAATAG CAGAATGGGC AGACATTACG AATGCACACG 600
GTGTGCTGGG CCCAGGTATT GITAGCGGTr TGAAGCAGGC GGCAGAAGAA GTAACAAAGG 660
AACCTAGAGG ACTTTTGATG TTAGCAGAAT TGTCATGCAA GGGCTCCCTA TCTACTGGAG 720
AATATACTAA GGGTACTGrTT GACATTGCGA AGAGCGACAA AGAnTTGTT ATCGGCTTTA 780
TTGCTCAAAG AGACATGGGT GGAAGAGATG AAGGTTACGA TrGGITGATT ATGACACCCG 840
CTrGTGGGirr AGATGACAAG GGAGACGCAT TGGGTCAACA GTATAGAACC CTGGATGATG 900
TGGTCTCTAC AGGATCTGAC ATTATTATTG TTGGAAGAGG ACTATTTGCA AAGGGAAGGG 960
ATGCTAAGGT AGAGGGTGAA CGTTACAGAA AAGCAGGCTG GGAAGCATAT TTGAGAAGAT 1020
GCGGCCAGCA AAACTAAAAA ACTGTATTAT AAGTAAATGC ATCTATACTA AACTCACAAA 1080
TTAGAGCTTC AATTTAATTA TATCAGTTAT TACCC 1115

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1334 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iii) ANTI-SENSE: NO 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Saccharomyces cerevisiae 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

AATPCCCATT ATITAAGGAC CTATTGnTT TTCCAATAGG TGGTTAGCAA TCGTCTTACT 60 
TTCTAACTTT TCTTACCTTT TACATTTCAG CAATATATAT ATATATTTCA AGGATATACC 120 
ATTCTAATGT CTGCCCCTAT GTCTGCCCCT AAGAAGATCG TCGmTGCC AGGTGACCAC 180 
GTTGGTCAAG AAATCACAGC CGAAGCCATT AAGGTTCTTA AAGCTATTTC TGATCTITCCTr 240 
TCCAATCTCA ACTrCGATTT CGAAAATCAT TTAATrGGTG GTGCTGCTAT CGACGCTACA 300
GCTcrrcccAC ttccagatga ggcgctggaa gcctccaaga AGcrrrGATGC ccmrrGTrA 360
GCrrGCTCTGG CTGCrCCTAA ATGGGGTACC GGTAGTGTTA GACCTGAACA AGCnTTACTA 420 
AAAATCCCTA AAGAACTTCA ATrCTTACGCC AACTTAAGAC CATGTAACTT TGCATCCGAC 480 
TCTCrrrrAG ACTTATCTCC AATCAAGCCA CAATITGCTA AAGCrrACTGA CTrCGTTGTr 540 
GTCAGAGAAT TACTGGGAGG TATTTACTrr GCTAAGAGAA AGGAAGACGA TGCTGATGCrr 600 
CTCGCTTGGG ATACTGAACA ATACACCGriT CCAGAAGTGC AAAGAATCAC AAGAATGGCC 660 
GcrrrcATGG CCCTACAACA TGAGCCACCA TTGCCTATTT GGTCCTTGGA TAAAGCTAAT 720 
GrmTGGCCT CTTCAAGATT ATGGAGAAAA ACTGTGGAGG AAACCATCAA GAACGAATTT 780 
CCTACATTGA AGGTTCAACA TCAATTGATT GATTCTGCCG CCATGATCCT AGTTAAGAAC 840 
CCAACCCACC TAAATGGTAT TATAATCACC AGCAACATGT TrGCTGATAT CATCTCCGAT 900 
GAAGCCTCCG TTATCCCAGG TTCCTrGGCT TrGITGCCAT CTGCGTCCTT GGCCTCTTTG 960 
CCAGACAAGA ACACCGCATT TGGTrTGTAC GAACCATGCC ACGGTTCTGC TCCAGATTTG 1020 
CCAAAGAATA AGGITGACCC TATCGCCACT ATCTTCrrCTG CTGCAATGAT GTrGAAATTG 1080 
TCATTGAACT TGCCTGAAGA AGOTAAGGCC ATTGAAGATG CACTTAAAAA GGTriTGGAT 1140
GCAGGTATCA GAACTGCTTGA TTTAGGrGGrT TCCAACAGTA CCACCGAACT CGGrTGATGCT 1200 
OTCGCCGAAG AAGITAAGAA AATCCTTGCT TAAAAAGATT CTCriTlTil ATGATATTTG 1260 
TACATAAACT TTATAAATGA AATTCATAAT AGAAACGACA CGAAATTACA AAATGGAATA 1320 
TGTTCATAGG GTAG 1334 
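
The listings above print each row as blocks of ten bases followed by a running base count, so a row's count should equal the previous count plus the bases on that row. The following is a minimal sketch of a consistency check built on that observation. The function name `check_listing` and its `start` parameter are hypothetical and are not part of the application; the two sample rows are the last two rows of SEQ ID NO: 6.

```python
VALID_BASES = set("ACGT")


def check_listing(rows, start=0):
    """Return (row_index, message) pairs for rows that break the layout.

    rows  -- listing rows: space-separated base blocks, then a running count
    start -- number of bases that precede the first row (assumption: known)
    """
    problems = []
    position = start
    for index, row in enumerate(rows):
        fields = row.split()
        if len(fields) < 2:
            continue  # blank or incomplete row: nothing to compare
        *blocks, printed = fields
        bases = "".join(blocks)
        if set(bases) - VALID_BASES:
            problems.append((index, "non-ACGT character in sequence"))
        expected = position + len(bases)
        if not printed.isdigit():
            problems.append((index, f"unreadable count {printed!r}, expected {expected}"))
            position = expected
        elif int(printed) != expected:
            problems.append((index, f"count {printed} does not match expected {expected}"))
            position = int(printed)  # resynchronise on the printed value
        else:
            position = expected
    return problems


# Last two rows of SEQ ID NO: 6; 1260 bases precede them.
rows = [
    "TACATAAACT TTATAAATGA AATTCATAAT AGAAACGACA CGAAATTACA AAATGGAATA 1320",
    "TGTTCATAGG GTAG 1334",
]
print(check_listing(rows, start=1260))  # prints []
```

A row whose count cannot be read, or whose blocks contain characters other than A, C, G and T, is reported rather than corrected, since the correct bases cannot be inferred from the layout alone.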






1. Promoter DNA incorporating a structural gene starting position 
characterised in that the DNA has a unique SspI restriction site at 
the structural gene start position. 

2. A method for producing a DNA as claimed in claim 1 comprising 
subjecting promoter DNA to Site-Directed Mutagenesis to create a 
unique SspI restriction site at the structural gene start position. 

3. A method as claimed in claim 2 wherein the position of the created 
site is such that the triplet sequence, ATG, corresponding to the 
translational start codon of the structural gene becomes ATA within 
the SspI recognition site AATATT. 

4. A method as claimed in claim 2 or 3 wherein the heterologous gene 
to be inserted is similarly modified wherein the nucleotide triplet 
corresponding to the translational start codon is changed to CAG, 
while the triplet immediately 5' is changed to CTG in order to create 
a PstI restriction site, CTGCAG. 

5. A method as claimed in claim 4 wherein the creation of the PstI, 
or equivalent site, is performed simultaneously to isolation of the 
gene by utilising a mutagenic primer in a polymerase chain reaction 
(PCR) catalysed gene amplification procedure. 

6. A method as claimed in claim 4 or 5 wherein the heterologous gene 
is digested with PstI restriction endonuclease and the 3' overhanging 
ends removed by the 3' to 5' exo-nucleolytic activity of T4 DNA 
polymerase, the gene then excised using one or more of the restriction 
enzymes whose sites are present within the polylinker of the vector 
whereby the first base of the blunt-ended DNA fragment is the third 
nucleotide, "G", of its first codon and the gene DNA is then ligated 
into the vector which has been digested previously with SspI and a 
restriction enzyme compatible with that used to excise the 
heterologous gene, whereby fusion of the vector promoter region 
(which ends in "AT") and heterologous gene (which begins in a "G") 
results in the recreation of the translational start, ATG. 

7. Recombinant DNA comprising a yeast promoter sequence characterized 
in that the leader region of the promoter sequence is replaced with 
leader sequence of the replication protein 2 (REP2) gene (ORF C) of 
the yeast 2 μm plasmid. 

8. Recombinant DNA as claimed in claim 7 wherein the yeast promoter 
derived portion is that of the phosphoglycerate kinase (PGK) promoter. 

9. Recombinant DNA as claimed in claim 7 or 8 wherein the upstream 
activating sequence element and TATA-box are those as found in 
the PGK promoter. 

10. Recombinant DNA as claimed in any one of claims 7 to 9 wherein 
the UAS element and TATA-box are fused to the 86 nucleotides residing 
immediately 5' to the 2 μm plasmid REP2 gene. 

11. Recombinant DNA comprising a sequence of bases 1 to 635 of SEQ ID 1. 

12. An expression cassette comprising recombinant DNA as claimed in 
any one of claims 1 and 7 to 11 characterized in that it further 
includes a copy of the lacZ' gene, containing the multiple cloning 
sites of pMTL23, preceded by the promoter DNA of any one of claims 7 
to 11, and followed by tandemly arranged, yeast gene-derived, 
transcriptional terminators. 

13. An expression cassette comprising a DNA sequence SEQ ID 1. 

14. A method for cloning a heterologous gene into an expression 
cassette as claimed in claim 12 or 13 wherein a primer oligonucleotide 
for the heterologous gene is designed having its 5' end corresponding 
to the G residue of the ATG translational start point, and a specific 
sequence amplification is carried out using the primer oligonucleotide 
to isolate the heterologous gene ready for insertion into the cassette. 

15. An E. coli or S. cerevisiae shuttle plasmid comprising an 
expression cassette as claimed in claim 12 or 13 or as provided by a 
method as described in any one of claims 2 to 6 or 14. 
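
The junction described in claims 3, 4 and 6 can be sketched in a few lines of string handling. This is an illustrative sketch only: the function names and the two toy sequences are hypothetical and are not taken from the application. It assumes the usual cut positions of the two enzymes (SspI cuts blunt between AAT and ATT; PstI cuts after CTGCA, leaving a 3' overhang that T4 DNA polymerase removes, so the gene fragment starts at the final G of the site).

```python
SSPI_SITE = "AATATT"  # SspI cuts blunt: AAT^ATT
PSTI_SITE = "CTGCAG"  # PstI cuts CTGCA^G, leaving a 3' overhang


def promoter_arm_after_sspi(promoter: str) -> str:
    """Top strand of the promoter arm after SspI digestion (ends in ...AAT)."""
    if promoter.count(SSPI_SITE) != 1:
        raise ValueError("promoter must contain exactly one SspI site")
    return promoter[: promoter.index(SSPI_SITE) + 3]


def gene_arm_after_psti_and_t4(gene: str) -> str:
    """Top strand of the gene arm after PstI digestion and T4 trimming.

    The top strand is cut just before the last G of CTGCAG; removing the
    3' overhang of the opposite strand leaves a blunt end starting at that G.
    """
    if PSTI_SITE not in gene:
        raise ValueError("gene must contain a PstI site")
    return gene[gene.index(PSTI_SITE) + 5 :]


def fuse(promoter: str, gene: str) -> str:
    """Blunt-end ligation of the promoter arm to the trimmed gene arm."""
    return promoter_arm_after_sspi(promoter) + gene_arm_after_psti_and_t4(gene)


# Toy sequences (hypothetical). The promoter carries ATA inside AATATT where
# the start codon was; the gene carries CTG CAG in place of xxx ATG.
toy_promoter = "CCGGTTCACAATATTGGC"
toy_gene = "TTCTGCAGGCTAAATAA"

fused = fuse(toy_promoter, toy_gene)
print(fused)  # prints CCGGTTCACAATGGCTAAATAA
```

In the fused toy sequence the promoter arm contributes the final "AT" and the gene arm contributes the leading "G", so the start codon ATG reappears directly upstream of the second codon (GCT in this example).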